(For Hons., Degree and Postgraduate classes of all Indian Universities)



Dr. H. C. Sinha, Ph.D.

Reader & Head of Department of Mathematics 

Bareilly College, Bareilly 

(Ex. Convener; Mathematics Board of Studies, Agra University) 

& 

L. S. Varshney

Department of Mathematics & Statistics 
K.G.K. (Post Graduate) College, Moradabad 



leading educational publishers, 

MEERUT CITY. 




Published by 

S.V.Nath. 

Partners 

Jai Prakash Nath & Co. 
Meerut-2 


Office 73259
Res. 72214 


First Edition


Price : Rs. 29.50 only




All Rights Reserved with the Authors 


Printed at 

Satish Printers & Agrawal Press 
Meerut 


Dedicated to 

Mv. A. A.V<VRMAW. 

D »- r - D , N . Tul 1 L 

(Formerly, Head of the Statistics Department

PATNA UNIVERSITY
&

now a Statistician in U.N.O.)




PREFACE 


The original impetus for writing this work resulted from a 
course of lectures we gave on the subject in Bareilly College, 
Bareilly and K.G.K. College, Moradabad. Several colleagues
and pupils then suggested that we should produce a book on 
Mathematical Statistics. 

In this textbook we have attempted to present, in a logical
pattern of development, a unified and comprehensive coverage of the
basic foundations of the subject matter. It emphasizes the
concepts and basic principles of mathematical statistics that are
common to applications in most branches of applied mathematics.
The presentation demonstrates the universality of mathematical
expression in portraying physical phenomena. It introduces a
high level of general theoretical treatment of fundamentals, which
are presented at an elementary level. Thus the reader gets a much
clearer understanding and better appreciation of the underlying
physical assumptions and limitations. Many illustrative examples
are included to enhance the reader’s understanding of the
subject and to develop his ability to analyse new and challenging
problems in this field. The book also contains a wide variety of
graded problems, ranging from routine applications to ones which
will extend the very best student.

The order of the contents of the text is believed to present a
rational classification of the topics treated. The introductory
chapter begins with a discussion of the history of the word
‘statistics’. The regimes of statistics are presented and categorized
there so that the reader can grasp a general picture of statistics and
its related fields before attempting to study any particular branch of
the subject.

During our study of the subject, we have been influenced by
the writings of a large number of authors including W. Feller,
H. Cramer, S. Goldberg, B. Gnedenko, M. G. Kendall and Stuart,
M. E. Munroe, J. V. Uspensky, E. Parzen and S. S. Wilks, and we
are deeply indebted to all of them. No attempt has been made to
give a complete list of all that has been published on mathematical
statistics, as this would make the volume too cumbersome, but we
hope that the references which have been cited will facilitate the
task of the reader who wishes to acquire fuller information.

In a work of this nature it is unavoidable that there should 
be some errors of judgement and also oversights and omissions. 
We should appreciate the suggestions and criticisms of those who
use this book. 

In particular, we wish to take this opportunity to express
our sincerest appreciation of Sri Hardwari Lal (Department of
Mathematics, K.G.K. College, Moradabad), who gave us inspiration
throughout the preparation of the book. We greatly
appreciate the efforts of Sri D. K. Gupta (son of L. S. Varshney),
who read the proofs.

Finally, we take this opportunity to thank our publishers M/S 
Jai Prakash Nath & Co., and the printers for the cooperation shown
in the production of this book. 

15 July, 1974 

H.C. Sinha (116, Har Bhawan, Bhoor, Bareilly)
L.S. Varshney (Subji Mandi, Ganj, Moradabad)



Chapter   Subject

1.   Introduction
2.   Frequency Distribution and Measures of Location
3.   Measures of Dispersion, Skewness and Kurtosis, Moments of Frequency Distributions
4.   Theory of Probability
5.   Mathematical Expectation
6.   Special Discrete Probability Distributions
7.   Continuous Probability Distributions
8.   Special Continuous Distributions
9.   Regression and Correlation; Curve Fitting
10.  Multiple and Partial Correlation
11.  Attributes
12.  Theory of Sampling
13.  Exact Sampling Distributions (t, z, F, etc.) and Tests of Significance
14.  Test of Significance based on Chi-Square Distribution
15.  Estimation
16.  Confidence Interval
17.  Tests of Hypothesis
18.  The Calculus of Finite Differences
19.  Analysis of Variance



Symbols frequently used in the Book. 


⇒        implies
∈        belongs to
∪        union of two sets
∩        intersection of two sets
U        universal set
S        sample space
≈ or ≃   approximates to
X        random variable





1 


Introduction 


1.1. History of the word ‘Statistics’. Statistics, as a matter of
fact, is as old as human society itself. The words ‘statistics’,
‘statistician’ and ‘statistical’ are merely a century old in their
present usage, though the words themselves have long been in use.
The words ‘statist’, ‘statistics’ and ‘statistical’ seem to be all
derived, more or less indirectly, from the Latin word ‘status’ (a
political state), or the Italian word ‘statista’, or the German word
‘Statistik’. The term statistics as used by German writers of the
eighteenth century, by Zimmermann and by Sir John Sinclair, meant
simply the exposition of the noteworthy characteristics of a state.
The government, in ancient times, used to gather information
regarding the population and the property or wealth of the country.
Information on population gave the government an idea of the
man-power of the country, whereas information on property and wealth
provided it with a basis for imposing new taxes and levies. The word
‘statist’ occurs in Hamlet (1602, Act 5, scene 2), Cymbeline (1610 or
1611, Act 2, scene 4) and Paradise Regained (1671). The term
‘statistics’ is found in The Elements of Universal Erudition, by
Baron J. F. von Bielfeld, translated by W. Hooper, M.D. One of its
chapters is entitled Statistics, and contains a definition of the
subject as ‘the science that teaches us what is the political
arrangement of all the modern states of the known world’. A rather
wider definition of ‘statistics’ occurs in the preface to Political
Survey of the Present State of Europe by E. A. W. Zimmermann.

During the reign of Chandra Gupta Maurya (324-300 B.C.),
there existed in India a very efficient system of collecting official
and administrative statistics. Kautilya’s Arthashastra confirms that
even before 300 B.C. a system of collecting ‘vital statistics’ and
of registering births and deaths was in existence. Raja Todarmal,
who was the land and revenue minister during Akbar’s reign
(1556-1605 A.D.), maintained a very good record of land and




Mathematical Statistics 


agricultural statistics. Detailed administrative and statistical
surveys were conducted during Akbar’s reign, as is evident from the
Ain-i-Akbari written (in 1596-97) by Abul Fazl, one of the nine gems
of Akbar.

In England, statistics were the outcome of the Napoleonic Wars.
The wars compelled the government to undertake the systematic
collection of numerical data in order to assess revenues and
expenditure with greater precision and then to levy new taxes to
meet the cost of war.

‘Vital Statistics’ saw the light in the seventeenth century.
Captain John Graunt of London (1620-1674) is known as the
father of Vital Statistics. He was the first man to study the
statistics of births and deaths. The idea of life insurance was a
consequence of the computation of mortality tables and of the
expectation of life at different ages by a number of different people,
among whom the names of Casper Newman, Sir William Petty (1623-1687),
James Dodson and Dr. Price are worth noting.

The introduction of the ‘Theory of Probability’ and the ‘Theory of
Games and Chance’ led to the theoretical development of
modern statistics. It is a well known fact that the mathematicians
and gamblers of France, Germany and England were the chief
contributors to the Theory of Probability and the Theory of Games. The
famous ‘Problem of Points’ posed by the gambler Chevalier de
Mere was solved by the French mathematician Pascal (1623-1662)
in consultation with another French mathematician, P. Fermat
(1601-1665). His study of the problem laid the foundation of the
theory of probability which is, in fact, the backbone of the modern
theory of statistics.

The efforts of Pascal led to the discovery of the properties of the
binomial expansion. It is he who invented the mechanical computing
machine. James Bernoulli (1654-1705), who was the first man to write a
treatise on the ‘Theory of Probability’; De Moivre (1667-1754),
who studied probabilities and annuities and published his important
work ‘The Doctrine of Chances’ in 1718; Laplace (1749-1827),
who published his monumental work on the theory of probability; and
Gauss (1777-1855), who has been considered the most original
of all writers on statistical subjects and who provided the principle
of least squares and the normal law of errors, were other notable
contributors in the field of probability. The reader should keep in
mind that Euler, Lagrange, Bayes, A. Markoff, Khintchine and
Kolmogorov were prominent mathematicians of the 18th, 19th
and 20th centuries who added a great deal to the field
of probability. The modern veterans in the development
of Statistics are Englishmen. Francis Galton (1822-1911) worked
on regression and utilized statistical methods in the field of
Biometry. Karl Pearson (1857-1936) was the founder of the
greatest statistical laboratory in England in 1911 and is the pioneer
of correlation analysis. It is important to note that his discovery
of ‘the Chi-square test’, which is, in fact, the first and most
important of the modern tests of significance, won for statistics a
place as a science. In 1908 W. S. Gosset discovered Student’s ‘t’
distribution. Sir Ronald A. Fisher (1890-1962) is known as the ‘Father
of Statistics’. It is he who applied statistics to various fields, for
example genetics, biometry, education, agriculture, etc. In
addition to this, he is the pioneer in introducing the concepts of
‘Point Estimation’ (the principle of maximum likelihood, efficiency,
sufficiency, etc.), ‘Fiducial Inference’ and ‘Exact Sampling
Distributions’. He also pioneered the study of ‘Analysis of Variance’
and ‘Design of Experiments’.

1.2. Definition of Statistics. The fundamental notion in
statistical theory is that of a set or an aggregate, a concept for
which statisticians use a special word, ‘population’. Population
denotes any collection of objects, whether animate or inanimate.

The science of Statistics deals with the properties of popula¬ 
tions. The statistician, like nature, is mainly concerned with the 
species and is careless of the individual. He is not interested,
statistically speaking, in whether some particular individual has
brown eyes or is a forger, but rather in how many of the indivi¬ 
duals have brown eyes or are forgers and whether the possession 
of brown eyes goes with a propensity to forgery in the population. 

Now we define Statistics to be the branch of scientific method
which deals with the properties of populations. Note that this
definition is rather too general: Statistics deals only with numerical
properties. The reformulated definition of Statistics runs
as follows :

Statistics is the branch of scientific method which deals with
the data obtained by counting or measuring the properties of
populations.

This definition is still too general. A set of logarithm tables
is a population of numerals, but it is hardly a subject for statistical
enquiry, because each numeral is determined according to mathematical
laws. The statistician is only interested in populations
occurring in Nature. Hence we reformulate our definition as
follows :

Statistics is the branch of scientific method which deals with
the data obtained by counting or measuring the properties of
populations of natural phenomena. In this definition ‘natural
phenomena’ includes all the happenings of the external world,
whether human or not.

It is important to note that ‘Statistics’, the name of the
scientific method, is a collective noun and takes the singular,
whereas a ‘statistic’ is defined to be a function of the observations
in a sample from some population. ‘Statistic’ in this sense takes
the plural ‘statistics’.

In the words of Webster, Statistics is defined as ‘classified
facts denoting the conditions of people in a state ... especially those
facts which can be stated in numbers or in any other tabular or
classified arrangement’.

Bowley defines statistics as ‘numerical statements of facts in
any department of enquiry placed in relation to each other’. He himself
defines Statistics in the following three different ways :

(a) Statistics is the science of counting. 

(b) Statistics is the science of averages. 

(c) Statistics is the science of the measurement of social 
organisms, regarded as a whole in all its manifestations. 

Boddington says that ‘Statistics is the science of estimates 
and probabilities'. 

A more exhaustive definition given by Prof. Horace Secrist
runs as follows :

"By Statistics we mean aggregates of facts affected to a marked
extent by multiplicity of causes, numerically expressed, enumerated
or estimated according to reasonable standards of accuracy,
collected in a systematic manner for a pre-determined purpose and
placed in relation to each other".

Other definitions are as follows : 

‘The science of Statistics is the method of judging collective
natural or social phenomena from the results obtained from the
analysis or enumeration or collection of estimates’. (King)

‘Statistics is the science which deals with collection, classification
and tabulation of numerical facts as the basis for explanation,
description and comparison of phenomena’. (Lovitt)

‘Statistics is the science which deals with the collection,
analysis and interpretation of numerical data’. (Croxton and Cowden)

1.3. Importance and Scope of Statistics. In the modern age
Statistics is regarded not only as a device to collect numerical data
but also as a means to develop sound techniques to handle them, to
analyse them and to draw valid conclusions from them. Its scope is
very wide. It has applications in diverse spheres, social, economic
and political, and in almost all sciences, social as well as physical,
for instance economics, psychology, biology, education, sociology,
etc. It seems rather impossible to name any sphere of human activity
where statistics is not used.

Statistics and Planning. The modern age is, in fact, the ‘age
of planning’. The success of planning depends upon the correct
analysis of complex statistical data. As a matter of fact, statistics
is indispensable to planning. We see that almost all over the world,
governments, particularly those of budding economies, are using
planning as a basis of economic development.

Statistics and Economics. A variety of economic problems,
for instance wages, prices, analysis of time series and demand
analysis, can be solved by applying statistical data and techniques
of statistical analysis. Statistics has an important role in economic
development. Economic Statistics and Econometrics are the outcomes
of the wide application of mathematics and statistics in the
study of economics.

Statistics and Business. Statistics proves to be an indispensable
tool of production control. Business executives depend to a
great extent on statistical techniques to study the needs and desires
of consumers. Accuracy and precision in forecasting lead a
businessman to success, whereas faulty and inaccurate analysis of the
various causes affecting a particular phenomenon might lead to his
disaster.

Statistics and Industry. The widest applications of statistics in
industry are found in ‘Quality Control’. Statistical tools, viz.
inspection plans, control charts, etc., are of extreme importance in
production engineering. In the investigation of the quality of
manufactured parts, the model might be one that predicts the
percentage of defective parts that can be expected in the
manufacturing process.

Statistics and Mathematics. The theory of statistics can be
treated as a branch of mathematics in which probability is the
basic tool. Statistics and mathematics are inter-linked.
A knowledge of mathematics is essential to the study of
statistics. Generally statisticians are good mathematicians. For
instance, Bernoulli, Pascal, Laplace, De Moivre, Gauss and R. A. Fisher
were first of all talented mathematicians, and they were the main
contributors to statistics. Recent advances in statistical techniques
are the outcome of the wide application of advanced mathematics. In
the words of Connor, Statistics is the branch of applied mathematics
which specialises in data.

Statistics and Biology. Francis Galton was the first man to
study the association between statistical methods and biological
theories, in his work on ‘Regression’. Prof. Karl Pearson
remarked that the whole theory of heredity rests on a statistical
basis. He says that the whole problem of evolution is a problem
of vital statistics, a problem of longevity, of fertility, of health,
of disease, and it is as impossible for the evolutionist to proceed
without statistics as it would be for the Registrar General to discuss
the national mortality without an enumeration of the population, a
classification of deaths and a knowledge of statistical theory.

Statistics and Astronomy. The Principle of Least Squares
facilitated the development of the theory of the ‘Normal Law of Errors’
for the study of the movements of stars and planets.

Statistics, Education and Psychology. Statistics has wide 
applications in education and psychology. Factor Analysis helps us 
in determining the reliability and validity of a test. A new subject 
called ‘Psychometry’ has come into existence as a consequence of 
Factor Analysis. 

Statistics and War. The theory of ‘Decision Functions’ helps
military and technical personnel to plan maximum destruction
with minimum effort.

That is to say, the science of Statistics is inter-linked in some
way or other with almost all the sciences, social as well as physical.
In the words of Bowley, a knowledge of Statistics is like a knowledge
of a foreign language or of algebra : it may prove of use at any time
under any circumstances.

1.4. Limitations of Statistics. In almost every sphere of human
life, statistics has wide applications. Its important limitations are
as follows :

(a) The study of qualitative phenomena is impossible.
Statistics is, in fact, a science which deals with a set of numerical
data. It studies only those subjects of enquiry which are capable
of quantitative measurement. Statistical analysis cannot be applied
to qualitative phenomena such as honesty, poverty, culture, etc.,
because they cannot be expressed numerically. Statistical techniques
are, however, applicable indirectly when qualitative
expressions are reduced to precise quantitative ones. For instance,
the study of the intelligence of a group of candidates is
possible on the basis of their scores in a certain test.

(b) The study of the individual is not possible. Statistics studies
a population of objects but says nothing about the individuals.
When we deal with individual items, they do not constitute
statistical data and hence there arises no question of statistical
enquiry. Statistical analysis is applicable only to those problems in
which the study of group characteristics is possible.

(c) Statistical laws are not exact like the laws of the physical
and natural sciences. Statistical analysis allows us to talk only
in terms of probability and not in terms of certainty.

(d) Statistics is liable to be misused. Statistical methods are
most dangerous tools in the hands of the inexpert. Statistics
is one of those sciences whose adepts must exercise the
self-restraint of an artist. In the words of King, Statistics are like
clay of which one can make a God or a Devil as one pleases. The use
of statistical tools by inexperienced and untrained persons may
lead to very fallacious conclusions.

1.5. Basic Statistical Concepts. When the statistician obtains
statistical data and is satisfied that they are reliable
enough to permit him to proceed, he wishes to lick them
into shape by a process called condensation. Statistics are,
in fact, numerical facts derived from statistical data. Statistical data
come out of the observation of characteristics of individuals. We use
the term ‘individual’ in a general sense.

A quantitative characteristic is known as a variable. Height,
weight, age, income, etc. of people are noteworthy
examples of quantitative characteristics. A variable is also defined
to be a function which does not retain the same value throughout
the mathematical operation.

Variables are of two kinds : (a) discrete variables and
(b) continuous variables.

A discrete variable is a variable that can assume only a finite
number, or an infinite sequence, of distinct values. This implies
that the values can be arranged in a definite order. The number of
accidents, the number of patients in a hospital, the number of members
in a family, the number of books in a library, etc. are noteworthy
examples of discrete variables.

A continuous variable is a variable which can assume any
value in some interval or intervals (i.e. which takes any numerical
value, integral as well as fractional). Weights, lengths,
temperatures and velocities, which are essentially variables involving
measurement, are well known examples of continuous variables.

Population. In Statistics, population denotes a set of objects,
animate or inanimate, under study. A population of a finite
number of individuals is called a finite population, whereas an
infinite population is a population of an infinite number of
individuals.

A finite sub-set of statistical individuals in a population is 
called a sample , and the number of individuals in the sample is 
known as the sample size. 

The statistical constants, for example the mean ($\mu$), the variance
($\sigma^2$), etc., computed from the population values are known as
‘parameters’. The word parameter is used here in a general sense.
In a narrower sense, a parameter is a measure
occurring in the probability distribution of the variable : for
instance, $m$ is the parameter of the Poisson distribution, whereas $m$
and $\sigma$ are the parameters of the Normal distribution.

The corresponding statistical constants, for example the mean
($\bar{x}$), the variance ($s^2$), etc., computed from the sample
values are known as ‘statistics’. A statistic is thus a function of
the sample values only.
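
The distinction between parameters and statistics can be illustrated with a small sketch. The population below (1000 simulated heights), the sample size of 50 and all variable names are assumptions made for illustration only; they are not data from the text.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the sketch is reproducible

# Hypothetical finite population: heights (in cm) of 1000 individuals.
population = [random.gauss(165.0, 8.0) for _ in range(1000)]

# Parameters: constants computed from every value in the population.
mu = statistics.fmean(population)
sigma2 = statistics.pvariance(population, mu)

# A sample is a finite subset of the population; its size is the sample size.
sample = random.sample(population, 50)
sample_size = len(sample)

# Statistics: functions of the sample values only.
x_bar = statistics.fmean(sample)
# Divisor n is used here; the choice of divisor is an assumption of this sketch.
s2 = statistics.pvariance(sample, x_bar)

print(f"parameters: mu = {mu:.2f}, sigma^2 = {sigma2:.2f}")
print(f"statistics: x_bar = {x_bar:.2f}, s^2 = {s2:.2f} (n = {sample_size})")
```

Drawing a different sample changes $\bar{x}$ and $s^2$ but leaves $\mu$ and $\sigma^2$ untouched, which is the sense in which a statistic depends on the sample values only.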



2 


Frequency Distribution and 

Measures Of Location 


2.1. Frequency Distribution. The manner in which the class
frequencies are distributed over the class intervals is spoken of as
the frequency distribution. The following table gives
the frequency distribution of the marks of 49 students in a class.

Marks      Number of students    Cumulative
(Class)    (Frequency)           Frequency

 5-10             5                   5
10-15             6                  11
15-20            15                  26
20-25            10                  36
25-30             5                  41
30-35             4                  45
35-40             2                  47
40-45             2                  49


This type of representation of frequencies is called a grouped
frequency distribution. In the above example, the marks of 49 students
have been divided into 8 groups, viz. 5-10, 10-15, ..., 40-45.
The groups are also known as classes and the end figures are
called class limits. The figures on the left side of the classes are
called lower limits and the figures on the right side of the classes are
called upper limits. The difference between the upper and lower
limits of a class is called the width of the class or the magnitude of
the class interval. The number of individuals or observations in
each class is called the class frequency. Here we presume that the
class frequencies are uniformly distributed over the corresponding
class intervals, and so we will not involve ourselves in any serious
error if the class intervals are replaced by their mid-values. The
cumulative frequency corresponding to a class is the total number
of observations less than or equal to the upper limit of the class,
i.e. it is the total of all the frequencies up to and including that
class; e.g. the cumulative frequency for marks less than
or equal to 15 is 11. In the above example the variate values are
marks. If $x$ denotes marks, then the class intervals may be
expressed either as $5 \le x < 10$ or as $5 < x \le 10$, but this
should be made definite at the very beginning. For the calculation of
various parameter values of the data, it is found convenient to
replace class intervals by their mid-values; e.g. in the above table
the variable $x$ (the number of marks) assumes the set of values
7.5, 12.5, 17.5, 22.5, ..., 42.5. The frequencies are generally denoted
by the set of values $f_i$, where $i = 1, 2, 3, \ldots$, and the variate
values by $x_i$.
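
As a check on the table of marks, the class widths, mid-values and cumulative frequencies can be recomputed from the class limits and class frequencies. The following is a minimal sketch; the variable names are illustrative and not taken from the text.

```python
from itertools import accumulate

# Class limits and class frequencies from the table of marks of 49 students.
lower_limits = [5, 10, 15, 20, 25, 30, 35, 40]
upper_limits = [10, 15, 20, 25, 30, 35, 40, 45]
frequencies = [5, 6, 15, 10, 5, 4, 2, 2]

# Width of a class = upper limit - lower limit.
widths = [u - l for l, u in zip(lower_limits, upper_limits)]

# Mid-value of a class = average of its two limits.
mid_values = [(l + u) / 2 for l, u in zip(lower_limits, upper_limits)]

# Cumulative frequency = total of all frequencies up to and including the class.
cumulative = list(accumulate(frequencies))

print("Class    f   cum. f   mid-value")
for l, u, f, c, m in zip(lower_limits, upper_limits, frequencies,
                         cumulative, mid_values):
    print(f"{l:>2}-{u:<2}  {f:>3}   {c:>5}   {m:>8}")
```

The cumulative column reproduces 5, 11, 26, ..., 49, and the mid-values run from 7.5 to 42.5 in steps of 5, the common class width.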

The variable $x_i$ occurs with frequency $f_i$, where we can take
$i = 1, 2, \ldots, n$. The total frequency is denoted by
$N = \sum_{i=1}^{n} f_i$, and the relative frequency for the variate
$x_i$ is denoted by $f_i/N$, which we may denote by $p_i$. The weighted
arithmetic mean $\bar{x}$ of the $N$ values of the variable is given by

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{n} f_i x_i = \sum_{i=1}^{n} p_i x_i,$$

since the value $x_i$ occurs $f_i$ times.
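
The two forms of the weighted mean can be checked numerically. The sketch below applies them to the marks table of Section 2.1, taking the class mid-values as the $x_i$; the variable names are illustrative.

```python
# Mid-values x_i and class frequencies f_i from the marks table.
mid_values = [7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5]
frequencies = [5, 6, 15, 10, 5, 4, 2, 2]

# Total frequency N = sum of the f_i.
N = sum(frequencies)

# Relative frequencies p_i = f_i / N.
relative = [f / N for f in frequencies]

# First form: (1/N) * sum of f_i * x_i.
mean_from_frequencies = sum(f * x for f, x in zip(frequencies, mid_values)) / N

# Second form: sum of p_i * x_i.
mean_from_relative = sum(p * x for p, x in zip(relative, mid_values))

print(f"N = {N}")
print(f"weighted mean = {mean_from_frequencies:.2f}")
```

Both forms give the same value, 1027.5/49, or about 20.97 marks.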

The following points must be borne in mind while grouping or 
tabulating a frequency distribution. 

(i) The class intervals should be of equal width since with 
this the comparison over various class-intervals is easily possible. 
Equality in width also ensures ease in computing the parameter 
values of the population. However, in some cases it may be 
desirable to use unequal class intervals e.g. in the case of the 
number of deaths of children, as children in the early years of age 
are more prone to death. 

(ii) The number of class-intervals should not ordinarily
exceed 20 and should not, in general, be less than 10. With more
than 20 class intervals the computations become unnecessarily
tedious, whereas with fewer than 10 a great deal of accuracy
may be lost.

(iii) The class-intervals must be defined with precision. It is
necessary to make clear whether the class $a$-$b$ means $a \le x < b$
or $a < x \le b$. For continuous variates, it is desirable to take into
account the accuracy of measurements. If measurements are taken
to the nearest eighth of an inch, the class limits for the class
interval “57 and less than 58” should be taken as
$56\tfrac{15}{16}$-$57\tfrac{15}{16}$, and if they
are taken to the nearest quarter of an inch, then the class limits
should be taken as $56\tfrac{7}{8}$-$57\tfrac{7}{8}$.

(iv) The width of the class interval should be an integer as far as possible.
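
The class limits in point (iii) follow from shifting each nominal limit down by half the unit of measurement, since a reading recorded as 57 may stand for any true length within half a unit of 57. A minimal sketch of this rule, using exact fractions, is given below; the function names `true_class_limits` and `mixed` are hypothetical helpers introduced here.

```python
from fractions import Fraction


def true_class_limits(lower, upper, unit):
    """True limits of the class 'lower and less than upper' when every
    measurement is recorded to the nearest `unit`."""
    half_unit = Fraction(unit) / 2
    return Fraction(lower) - half_unit, Fraction(upper) - half_unit


def mixed(q):
    """Format a positive fraction as a mixed number, e.g. 911/16 -> '56 15/16'."""
    whole, rest = divmod(q, 1)
    return f"{whole} {rest}" if rest else f"{whole}"


eighth = true_class_limits(57, 58, Fraction(1, 8))
quarter = true_class_limits(57, 58, Fraction(1, 4))

print("nearest eighth: ", " to ".join(mixed(q) for q in eighth))
print("nearest quarter:", " to ".join(mixed(q) for q in quarter))
```

For measurements to the nearest eighth of an inch this gives 56 15/16 to 57 15/16, and to the nearest quarter of an inch 56 7/8 to 57 7/8.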

2.2. Frequency Polygon, Histogram. Suppose the sample of
observations $x_i$ have relative frequencies $f(x_i)$. The assemblage of
relative frequencies $f(x_i)$ for the sample is called the frequency
distribution of $x$ in that sample. If the variate $x$ is
discontinuous, as for example the number of flowers on stalks or the
number of beans in bean-pods, we obtain separate plotted points
$(x_i, f(x_i))$ which, joined to their neighbours, form a frequency
polygon.

[Figure: Frequency polygon]

In drawing the histogram of a given grouped frequency
distribution, we mark off along the axis of $x$ all the class intervals
on a suitable scale. With the class-intervals as bases, we draw
rectangles which are proportional in area to the frequencies in the
class intervals. For equal class-intervals, the heights of the
rectangles will be proportional to the frequencies, while for unequal
class-intervals, the heights will be proportional to the ratios
of the frequencies to the widths of the classes. If the $n$ measurements
$x_1, x_2, \ldots, x_n$ occur with relative frequencies
$f_1/N, f_2/N, \ldots, f_n/N$, the histogram drawn for this sample is
as given below.

[Figure: Histogram]








The histogram furnishes a rough approximation to the ideal 
probability curve. 
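
The rule for the heights of the rectangles can be sketched as follows. The class intervals and frequencies are hypothetical values chosen for illustration, with unequal widths so that height and frequency differ.

```python
# Hypothetical grouped data with unequal class intervals (not from the text).
classes = [(0, 5), (5, 10), (10, 20), (20, 40)]
frequencies = [10, 20, 30, 20]

# Width of each class interval.
widths = [upper - lower for lower, upper in classes]

# Height of each rectangle = frequency / width, so that area = frequency.
heights = [f / w for f, w in zip(frequencies, widths)]

# Area of each rectangle recovers the class frequency.
areas = [h * w for h, w in zip(heights, widths)]

for (lower, upper), f, h in zip(classes, frequencies, heights):
    print(f"{lower:>2}-{upper:<2}  frequency = {f:>2}  height = {h}")
```

Here the class 10-20 has the largest frequency (30) but not the tallest rectangle, because its frequency is spread over a width of 10; with equal widths the heights would simply be proportional to the frequencies.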

2.3. Frequency-curves. If in a grouped frequency distribution
the class intervals are made smaller, and at the same time the
number of observations is increased such that the class-frequencies
remain finite, the frequency polygon and the histogram will
approach more and more closely to a smooth curve. Such an ideal
limit to the polygon or the histogram is called a frequency-curve.
It is a concept of great importance in statistical theory. In the
frequency curve the area between any two ordinates is proportional
to the number of observations falling between the corresponding
values of the variable. Thus the integral $\int_a^b f(x)\,dx$ indicates
the number of observations falling in the interval $(a, b)$.

2.4. Cumulative frequency curve or the ogive. With the help
of the cumulative frequency table, we can plot the points with the
upper limits of the classes as abscissae and the corresponding
cumulative frequencies as ordinates. The curve obtained by
joining these points in a free-hand manner is known as the
cumulative frequency curve or the ogive. If the cumulative
frequencies are expressed as percentages of the total frequency and
these percentages are taken as the ordinates of the points, the curve
obtained by joining such points is called the percentage
cumulative frequency curve. Such a frequency curve is useful in
comparing different frequency distributions, as they are now adjusted
to a uniform standard.
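
The points of the ogive and of the percentage cumulative frequency curve can be computed directly from the class frequencies. The sketch below uses the marks table of Section 2.1; the function name `percentage_ogive` is a hypothetical helper, and the plotting itself is left out.

```python
from itertools import accumulate


def percentage_ogive(upper_limits, frequencies):
    """Return (upper limit, percentage cumulative frequency) points."""
    total = sum(frequencies)
    cumulative = accumulate(frequencies)
    return [(u, 100.0 * c / total) for u, c in zip(upper_limits, cumulative)]


# Upper class limits and class frequencies from the marks table.
upper_limits = [10, 15, 20, 25, 30, 35, 40, 45]
frequencies = [5, 6, 15, 10, 5, 4, 2, 2]

# Points of the ogive: abscissa = upper limit, ordinate = cumulative frequency.
ogive_points = list(zip(upper_limits, accumulate(frequencies)))

# Points of the percentage cumulative frequency curve.
percentage_points = percentage_ogive(upper_limits, frequencies)

for (u, c), (_, pct) in zip(ogive_points, percentage_points):
    print(f"x <= {u:>2}: cumulative f = {c:>2}, {pct:6.2f}% of total")
```

Because the last ordinate is always 100 per cent whatever the total frequency, two distributions of different sizes can be drawn on the same scale and compared.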

2.5. Types of Frequency curves. Some of the common
types of frequency curves that we generally come across are given
below.

(a) Symmetrical curve. In this type the class-frequencies
decrease to zero symmetrically on either side of a central maximum.
The two branches of the curve on either side of the central ordinate
are exactly similar. The important type of symmetrical curve is a
bell-shaped curve with a single smooth hump
at the centre, which tails off gradually at either end.

[Figure: Symmetrical curve]






(b) Moderately skew curve. A curve is said to be skew if it
lacks symmetry. In this case the class frequencies decrease
with greater rapidity on one side of the maximum than on the
other. It may have the “long tail” to the positive or right side,
in which case it is said to be positively skew; or to the negative or
left side, in which case it is negatively skew.

[Figure: Negative skew; Positive skew]

(c) Extremely Asymmetrical or J-shaped and U-shaped
curves. When the class frequencies run up to a maximum at one
end of the range, they form a J-shaped curve. In this case the
curve does not descend at all on one side or the other. A curve
so extremely skew is called positively J-shaped or negatively
J-shaped as the case may be. In the rare type of distribution
called the U-shaped curve the minimum ordinate is in the middle
region.

[Figure: Negatively J-shaped; Positively J-shaped; U-shaped]

Note. The area under a probability curve measures the total
probability of all possible values of x, and is therefore equal to 1.
The assemblage of values of probabilities $\phi(x)$, for all the possible
values $x_i$ of x that may occur in any system S, is called the
probability distribution of x in S. For example, if a symmetrical
coin is thrown 400 times and the runs of 1 head, 2 heads, 3
heads, ... are noted, the probability distribution thus obtained is as
given below.

x       1    2    3    4    5    6    7    8    Total
nφ     50   25   13    6    3    2    1    0     100

Here x is the number of heads in a run, and $n\phi$ is the number of
throws multiplied by the respective probability of a run of x heads.





Mathematical Statistics 


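Note: The respective probabilities of runs of 1 head, 2 heads,
3 heads, ... are $\tfrac12 \times \tfrac12 \times \tfrac12 = \tfrac18$, $\tfrac12 \times \tfrac12 \times \tfrac12 \times \tfrac12 = \tfrac1{16}$, $\tfrac1{32}$, $\tfrac1{64}$, ...

Hence, corresponding to the first run (three consecutive throws:
tail, head, tail), $n\phi = 400 \times \tfrac18 = 50$, and corresponding to the
second run (four consecutive throws), $n\phi = 400 \times \tfrac1{16} = 25$, etc.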

2.6. Sample Space. Whenever an experiment is performed,
the investigator is interested in its outcomes. An outcome is any
one of the possibilities that may be expected from the experiment.
We require that every possible outcome of an experiment is
enumerated. For example, in the coin-tossing experiment there
are two possible outcomes: heads and tails. The totality of all
outcomes forms a universal set which is called the sample space.
Each outcome is a point of the sample space.

Each conceivable outcome of a conceptual experiment that
can be repeated under similar conditions is called a sample point,
and the totality of conceivable outcomes (or sample points) is
called the sample space.

For example, if a random experiment consists of throwing a
coin two times, there are four conceivable outcomes: (H, H),
(H, T), (T, H), (T, T). Here there are four sample points which
make up the sample space.
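The enumeration above can be reproduced mechanically. The following is only an illustrative sketch: the variable name `sample_space` and the letters "H" and "T" for heads and tails are our own choices, not notation from the text.

```python
from itertools import product

# Every ordered pair of results from two throws of a coin.
# "H" and "T" are illustrative labels for heads and tails.
sample_space = list(product("HT", repeat=2))

for point in sample_space:
    print(point)  # one sample point per line

print(len(sample_space), "sample points")
```

Changing `repeat=2` to `repeat=3` would list the eight sample points for three throws.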

A sample space S is called discrete if it contains either (1) a finite
number of points or (2) an infinite number of points (countable
infinity) which can be put into a one-to-one correspondence with the
positive integers. The sample space corresponding to a single
throw of a die is finite. On the other hand, the sample space
corresponding to an experiment of throwing the die until a 6 appears is
an infinite space.

A sample space S containing a non-denumerable number of
points is called a continuous sample space. In this case the range
of the points (elements) covers a continuum of values, in contrast
with the discrete set of values in the discrete sample space. For
example, if the random experiment consists in observing the length
of life of electron tubes, the outcome can be any positive number,
and so the sample space is continuous.

A subset of a sample space is called an event. Thus, an
event is a subset of a sample space containing any number of
points or outcomes (see the Fig.).





An event which contains no outcomes is called a null set or an
empty set and represents an event that is impossible. An event
containing all sample points is an event that is certain to occur.
This is denoted by the universal set U, which means that the event
under consideration is sure to happen. The occurrence of an event
means the occurrence of any one of its possible outcomes.

2.7. Random variable. A random variable is a real-valued
function defined over the sample space of a random experiment.
Let E be an experiment and S a sample space associated with the
experiment. A function X assigning to every element $s \in S$ a real
number X(s) is called a random variable (one-dimensional
random variable).

If s is a point in the sample space S and X is a random variable,
then X(s) is the value of the random variable at s. In most of
our discussions of random variables we need not indicate the
functional nature of X. We are usually interested in the possible
values of X, rather than where these values come from. For
example, suppose that we toss two coins and consider the sample
space associated with this experiment. The sample space here
consists of four points $s_1 = (H, H)$, $s_2 = (H, T)$, $s_3 = (T, H)$,
$s_4 = (T, T)$. Define the random variable X as follows: X is the
number of heads obtained in the two tosses. Hence $X(s_1) = 2$,
$X(s_2) = 1$, $X(s_3) = 1$, and $X(s_4) = 0$.

[Figure: S = sample space of E; $R_X$ = possible values of X]
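The functional nature of X in this example can be made explicit with a short sketch. The function name `X`, the tuple encoding of outcomes and the variable `possible_values` are illustrative assumptions, not notation taken from the text.

```python
from itertools import product

# Sample space for tossing two coins (illustrative encoding).
sample_space = list(product("HT", repeat=2))

def X(s):
    """Random variable: the number of heads in the sample point s."""
    return s.count("H")

# Value of the random variable at each sample point.
values = {s: X(s) for s in sample_space}

# The set of all possible values of X.
possible_values = sorted(set(values.values()))
print(values)
print(possible_values)
```

The dictionary `values` plays the role of the mapping from S to the real line; `possible_values` is the set of values that X can take.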



The space $R_X$, the set of all possible values of X, is sometimes
called the range space. In a sense $R_X$ can be considered as another
sample space. The (original) sample space S corresponds to the
(possibly) non-numerical outcome of the experiment, while $R_X$ is
the sample space associated with the random variable X,
representing the numerical characteristic which may be of interest. If
$X(s) = s$, we have $S = R_X$.

The correspondence between a point of S and a point in the
coordinate space is designated by a mathematical function. This
function is termed a random variable. Generally, we denote random
variables by capital letters such as X and Y, and their specific
values by the same letters in lower case. A random variable
assumes different values $x_1, x_2, \ldots, x_n, \ldots$ which are points of the
coordinate space. The coordinate space may be a one-dimensional
or multi-dimensional space. The random variable may take a
finite number of n-tuple values or infinitely many. A random
variable is called discrete if it assumes only a finite or a denumerable
number of values, and a random variable is called continuous
if it assumes a continuum of values.

2.8. Discrete Probability Functions and Distributions.

Let X be a discrete random variable. Hence $R_X$, the range
space of X, consists of at most a countably infinite number of
values $x_1, x_2, \ldots$ With each possible outcome $x_i$ we associate a
number $p(x_i) = P(X = x_i)$, called the probability of $x_i$. The numbers
$p(x_i)$, $i = 1, 2, \ldots$ must satisfy the following conditions:

(a) $p(x_i) \ge 0$ for all i,

(b) $\sum_{i=1}^{\infty} p(x_i) = 1$.

The function p defined above is called the probability function
(or point probability function) of the random variable X. The
collection of pairs $(x_i, p(x_i))$, $i = 1, 2, \ldots$ is sometimes called the
probability distribution of X.

The probability distribution function F(x), known also as the
cumulative distribution function (CDF), is defined as

$$F(x) = \sum_{x_i \le x} p(x_i).$$

For example, the throw of an honest coin until a head appears
is a random experiment. The sample space of this experiment is
a discrete space. If X corresponds to the event of the appearance
of the first head on the kth throw, then X assumes the following
values:

$$[x_k = 1, 2, 3, \ldots, k, \ldots]$$

The probability function p(x) and the CDF are

$$p(x_k) = \left(\tfrac12\right)^k, \quad k = 1, 2, 3, \ldots \qquad ...(1)$$

$$F(x_k) = \sum_{i=1}^{k} \left(\tfrac12\right)^i = 1 - \left(\tfrac12\right)^k \qquad ...(2)$$

These functions are plotted in the adjoining figures. 


[Figures: probability function associated with Equation (1); CDF associated with Equation (2)]
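The two functions can also be tabulated directly. The sketch below assumes, as is standard for an honest coin, that the first head appears on the kth throw with probability $(1/2)^k$; the function names `p` and `F` are illustrative.

```python
from fractions import Fraction

def p(k):
    """Probability that the first head appears on the k-th throw of an honest coin."""
    return Fraction(1, 2) ** k

def F(x):
    """CDF: sum of p(k) over all positive integers k not exceeding x."""
    return sum((p(k) for k in range(1, int(x) + 1)), Fraction(0))

for k in range(1, 6):
    print(k, p(k), F(k))
```

The printed values of `F` are the heights of the steps of the CDF; between consecutive integers the function stays constant, which is why `F(2.5)` equals `F(2)`.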

Ex. A radio tube is inserted into a socket and tested. The
probability that it tests positive is $\tfrac34$ and hence the probability that
it tests negative is $\tfrac14$. The testing continues until the first positive
tube appears. Define the random variable X and determine the
probability distribution of X. Also evaluate P(A), where the event
A is defined as {The experiment ends after an even number of
repetitions}.

Sol. The sample space associated with the experiment is

$$S = \{+,\; -+,\; --+,\; ---+,\; \ldots\}$$

and $X = \{1, 2, 3, 4, \ldots\}$.

X = n if and only if the first (n-1) tubes are negative and the
nth tube is positive. If we suppose that the condition of one tube
does not affect the condition of another, we may write

$$p(n) = P(X = n) = \left(\tfrac14\right)^{n-1}\left(\tfrac34\right), \quad n = 1, 2, \ldots$$

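We can check that

$$\sum_{n=1}^{\infty} p(n) = \frac34\left(1 + \frac14 + \frac1{16} + \cdots\right) = \frac{3/4}{1 - 1/4} = 1,$$

and

$$P(A) = \sum_{n=1}^{\infty} p(2n) = \frac{3}{16} + \frac{3}{256} + \cdots = \frac{3}{16}\left(1 + \frac1{16} + \frac1{256} + \cdots\right) = \frac{3}{16} \cdot \frac{1}{1 - 1/16} = \frac15.$$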
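The two series can be checked numerically. This sketch assumes the probability of a positive test is 3/4, so that $p(n) = (1/4)^{n-1}(3/4)$; the names `p`, `total` and `even_total` and the cut-off of 40 terms are illustrative.

```python
from fractions import Fraction

POSITIVE = Fraction(3, 4)   # assumed probability that a tube tests positive
NEGATIVE = 1 - POSITIVE     # probability that a tube tests negative

def p(n):
    """P(X = n): first n-1 tubes negative, n-th tube positive."""
    return NEGATIVE ** (n - 1) * POSITIVE

# Partial sums of the two geometric series (40 terms is an arbitrary cut-off).
total = sum(p(n) for n in range(1, 41))
even_total = sum(p(2 * n) for n in range(1, 21))

print(float(total))       # close to 1
print(float(even_total))  # close to 1/5
```

Because exact fractions are used, the small gap between each partial sum and its limit is the neglected tail of the series, not rounding error.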









2.9. Continuous Probability Functions and Distributions.

A random variable X is said to be of the continuous type if
there exists a non-negative function f(x) such that for every real
number x the following relation holds:

$$F(x) = \int_{-\infty}^{x} f(t)\,dt.$$

F(x) is called the distribution function of X. The function
f(x) is called the probability density or the density function of the
random variable X.

Every density function f(x) satisfies the relation

$$\int_{-\infty}^{\infty} f(x)\,dx = F(+\infty) - F(-\infty) = 1.$$


If f(x) is a real-valued integrable function, then F(x) is said
to be absolutely continuous. It is known that then, almost everywhere,

$$f(x) = \frac{dF(x)}{dx}.$$

Moreover, for every real a and b, where a < b, we have

$$P(a < X \le b) = F(b) - F(a) = \int_a^b f(x)\,dx.$$

f(x) is known as the probability density function (PDF).

Since the probability distribution function is a non-decreasing
monotonic function, the density function will be non-negative over
the real axis: i.e. $f(x) \ge 0$.

If F(x) is a continuous function about x = a, the probability
of X assuming the value X = a is zero. In fact

$$P\{X = a\} = \lim_{\epsilon \to 0} P\{a - \epsilon < X \le a\}$$

or

$$P\{X = a\} = \lim_{\epsilon \to 0} [F(a) - F(a - \epsilon)] = 0.$$

For a continuous random variable the probability of the random
variable being in an interval decreases with the length of that
interval and in the limit becomes zero.

The function F(x) is non-decreasing. That is, if $x_1 \le x_2$, we
have $F(x_1) \le F(x_2)$.

This we show if we define two events A and B as follows:
$A = \{X \le x_1\}$, $B = \{X \le x_2\}$. Then, since $A \subset B$, by the
probability concept $P(A) \le P(B)$, which is the required result.





Theorem 1. If F(x) is the cdf (cumulative distribution function)
with pdf f(x), then

$$f(x) = \frac{d}{dx} F(x)$$

for all x at which F(x) is differentiable.

Proof. $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$.

Applying the fundamental theorem of the Calculus, we obtain

$$F'(x) = f(x).$$

Theorem 2. Let X be a random variable with possible values
$x_1, x_2, \ldots$ and suppose that these values have been arranged such that
$x_1 < x_2 < \ldots$ Let F(x) be the cdf of X. Then

$$p(x_j) = P(X = x_j) = F(x_j) - F(x_{j-1}).$$

Proof. Since we have assumed $x_1 < x_2 < \ldots$, we have

$$F(x_j) = P(X = x_j \cup X = x_{j-1} \cup \ldots \cup X = x_1) = p(j) + p(j-1) + \ldots + p(1)$$

[by the addition theorem of probability].

And

$$F(x_{j-1}) = P(X = x_{j-1} \cup X = x_{j-2} \cup \ldots \cup X = x_1) = p(j-1) + p(j-2) + \ldots + p(1).$$

Hence $F(x_j) - F(x_{j-1}) = P(X = x_j) = p(x_j)$.

Remark. When we speak of the cumulative distribution
function, or sometimes just the distribution function, we always
mean $F(x) = P(X \le x)$.

Ex. 1. (Cauchy's Distribution). A needle spins about the
point (0, a) of the (x, y) plane, with a > 0, and comes to a stop,
thereby determining an angle $\theta$ (see Fig. below). The direction of
the needle then intersects the x-axis at a point (X, 0). Assuming
that $\theta$ is a random variable with uniform probability distribution on
the interval $(-\pi/2, \pi/2)$, what is the probability distribution of the
random variable X?

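Sol. Here $\theta$ is assumed to be a continuous random variable
with the probability density

$$f(\theta) = \begin{cases} 1/\pi & \text{for } |\theta| < \pi/2 \\ 0 & \text{for } |\theta| > \pi/2 \end{cases}$$

and X is a known function of $\theta$:

$$X = a \tan\theta.$$

[Figure: the needle through (0, a) meeting the x-axis at (X, 0)]

We have therefore, for any $x_1 < x_2$,

$$P(x_1 < X \le x_2) = P\left(\tan^{-1}\frac{x_1}{a} < \theta \le \tan^{-1}\frac{x_2}{a}\right) = \frac1\pi\left[\tan^{-1}\frac{x_2}{a} - \tan^{-1}\frac{x_1}{a}\right].$$

In particular the d.f. of X is (replacing $x_2$ by x and $x_1$ by $-\infty$)

$$F(x) = P(X \le x) = \frac1\pi\left[\tan^{-1}\frac{x}{a} - \tan^{-1}(-\infty)\right] = \frac1\pi\left[\tan^{-1}\frac{x}{a} + \frac\pi2\right],$$

and the probability density is

$$f(x) = \frac{dF(x)}{dx} = \frac1\pi \cdot \frac{a}{a^2 + x^2}.$$

The probability distribution given by F(x) or f(x) is known
as Cauchy's distribution.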
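The construction can be imitated by simulation. The sketch below assumes the standard forms $F(x) = \frac{1}{\pi}[\tan^{-1}(x/a) + \pi/2]$ and $f(x) = \frac{1}{\pi}\,\frac{a}{a^2 + x^2}$ for this distribution; the function names, the seed and the sample size are illustrative choices.

```python
import math
import random

def cauchy_cdf(x, a=1.0):
    """F(x) = (1/pi) * (arctan(x/a) + pi/2)."""
    return (math.atan(x / a) + math.pi / 2) / math.pi

def cauchy_pdf(x, a=1.0):
    """f(x) = (1/pi) * a / (a^2 + x^2)."""
    return a / (math.pi * (a * a + x * x))

def simulate(n, a=1.0, seed=0):
    """Spin the needle n times: theta uniform on (-pi/2, pi/2), X = a*tan(theta)."""
    rng = random.Random(seed)
    return [a * math.tan(rng.uniform(-math.pi / 2, math.pi / 2)) for _ in range(n)]

xs = simulate(100_000)
empirical = sum(x <= 1.0 for x in xs) / len(xs)
print(empirical, cauchy_cdf(1.0))

# Central difference of F at x = 0.7 should be close to f(0.7).
h = 1e-5
slope = (cauchy_cdf(0.7 + h) - cauchy_cdf(0.7 - h)) / (2 * h)
print(slope, cauchy_pdf(0.7))
```

The first comparison checks the distribution function against the proportion of simulated points not exceeding 1; the second checks that the density is the derivative of the distribution function.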

Ex. 2. (Normal Probability distribution): Enumerate the
conditions for a continuous probability distribution.

Show when the function

$$f(x) = e^{-(x-a)^2/2\sigma^2}$$

will be a probability density function.

Definition: A probability distribution (S, B, P) is called
continuous when S is a Euclidean space $R^n$, B a Borel field
containing all open and closed intervals in $R^n$, and P is defined as
follows.

A function $f(x_1, x_2, \ldots, x_n) = f(x)$ with the following
properties is called a probability density function:

(i) $f(x) \ge 0$ for all $x \in R^n$,

(ii) $\int_A f(x)\,dx$ exists for every $A \in B$,

(iii) $\int_{R^n} f(x)\,dx = 1$,

and the probability measure P is defined as

$$P(A) = \int_A f(x)\,dx \quad \text{for every } A \in B.$$

The integrals in (ii) and (iii) are n-dimensional integrals, and
dx denotes the n-dimensional volume element.

Now $f(x) = e^{-(x-a)^2/2\sigma^2}$, $x \in R^1$,

where a is a given real number and $\sigma$ a given positive real number.





Clearly $0 < f(x) \le 1$ for all real x.
Hence f(x) satisfies (i), and

$$\int_A f(x)\,dx \le \int_{-\infty}^{\infty} f(x)\,dx \quad \text{for } A \in B,$$

so that (ii) is satisfied if we can show that $\int_{-\infty}^{\infty} f(x)\,dx$ exists.

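To see this we first evaluate the special definite integral

$$I = \int_{-\infty}^{\infty} e^{-u^2}\,du.$$

We observe that

$$I^2 = \int_{-\infty}^{\infty} e^{-u^2}\,du \int_{-\infty}^{\infty} e^{-v^2}\,dv = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(u^2+v^2)}\,du\,dv,$$

and changing to polar coordinates obtain

$$I^2 = \int_0^{2\pi}\int_0^{\infty} e^{-r^2}\,r\,dr\,d\theta = \int_0^{2\pi}\left[-\tfrac12 e^{-r^2}\right]_0^{\infty} d\theta = \tfrac12\int_0^{2\pi} d\theta = \pi;$$

hence

$$\int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{\pi}, \qquad \int_{-\infty}^{\infty} e^{-u^2/2\sigma^2}\,du = \sigma\sqrt{2\pi}.$$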

We compute the normalising constant for f(x):

$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{\infty} e^{-(x-a)^2/2\sigma^2}\,dx = \int_{-\infty}^{\infty} e^{-u^2/2\sigma^2}\,du = \sigma\sqrt{2\pi},$$

and obtain the probability density

$$\phi(x) = \frac{f(x)}{\sigma\sqrt{2\pi}} = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-a)^2/2\sigma^2}.$$

The numbers a and $\sigma$ are called the mean and the standard
deviation of the random variable X.

The probability density depends on two constants, a and $\sigma^2$,
and to reflect this dependence we sometimes use the notation

$$\phi(x;\,a,\,\sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-a)^2/2\sigma^2}.$$
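The value $\sigma\sqrt{2\pi}$ of the normalising constant can be confirmed by numerical integration. This is a rough sketch using a midpoint rule; the helper `integrate`, the particular values of `a` and `sigma`, and the truncation of the range to 12 standard deviations on either side are illustrative assumptions.

```python
import math

def integrate(g, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

a, sigma = 1.5, 2.0  # illustrative mean and standard deviation

def f(x):
    """Unnormalised function exp(-(x-a)^2 / (2 sigma^2))."""
    return math.exp(-(x - a) ** 2 / (2 * sigma ** 2))

area = integrate(f, a - 12 * sigma, a + 12 * sigma)
print(area, sigma * math.sqrt(2 * math.pi))
```

Dividing `f` by `area` gives a function whose integral is 1, which is exactly the normalisation step carried out analytically in the text.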

Ex. 3. Let X have the normal distribution

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-a)^2/2\sigma^2}$$

and let Y = |X - a|. Find the probability density of Y.





Sol. For y > 0 we have:

$$F(y) = P(Y \le y) = P(|X - a| \le y) = P(a - y \le X \le a + y)$$

$$= \frac{1}{\sigma\sqrt{2\pi}}\int_{a-y}^{a+y} e^{-(x-a)^2/2\sigma^2}\,dx = \frac{2}{\sigma\sqrt{2\pi}}\int_0^{y} e^{-t^2/2\sigma^2}\,dt.$$

If y < 0, this event for X is the empty set, for |X - a| cannot
be negative, and F(y) = 0.

By putting

$$f(y) = \begin{cases} 0 & \text{for } y < 0 \\ \dfrac{2}{\sigma\sqrt{2\pi}}\,e^{-y^2/2\sigma^2} & \text{for } y > 0 \end{cases}$$

we have

$$F(y) = P(Y \le y) = \int_{-\infty}^{y} f(t)\,dt.$$

For all real values of y, f(y) is the probability density function
and F(y) the probability distribution function.

2.10. Averages or measures of central tendency.

According to Croxton and Cowden, an average is a single
value within the range of the data that is used to represent all
the values of a statistical series. It is sometimes called a measure
of central tendency because it is somewhere within the range of
the data. In fact, individual values of the variable cluster round
it. Equivalently, averages give an idea about the concentration of
the values in the central part of the distribution. In the true sense
an average of a statistical series is the value of the variable which
represents the entire distribution.

There are three groups of measures of location in common use:


(a) the Means (arithmetic, geometric and harmonic) 

(b) the Median and 

(c) the Mode. 

2.11. Requisites for an ideal measure of central tendency.
The chief characteristics of an ideal measure of central tendency
suggested by Prof. Yule are as follows:

1. It should be rigidly defined.

2. It should be easily understandable and easy to calculate.

3. It should be based on all the observations.

4. It should be suitable for algebraic treatment.

5. It should be least affected by fluctuations of sampling.

6. It should not be affected much by extreme values.




2.12. Arithmetic Mean.

The arithmetic mean is the most generally used statistical
measure, and in fact is far older than the science of statistics itself.
If the random variate X takes on the values $x_1, x_2, x_3, \ldots, x_n$,
then the arithmetic mean, denoted by $\bar{x}$, is defined by

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac1n\sum_{i=1}^{n} x_i.$$

If the value $x_i$ occurs with frequency $f_i$, where the subscript
i takes the positive integral values 1, 2, ..., n, then

$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac1N\sum_{i=1}^{n} f_i x_i,$$

where $N = f_1 + f_2 + \cdots + f_n = \sum f_i$ is the total frequency.

If we put $p_i = f_i/N$, which is the relative frequency of $x_i$, then

$$\bar{x} = \sum_{i=1}^{n} p_i x_i, \quad \text{where } \sum_i p_i = \sum_i \frac{f_i}{N} = 1.$$

We interpret $\bar{x}$ as the expected value of x and sometimes denote it
by E(X) also.

Note: The arithmetic mean of a grouped frequency distribution
is calculated from the above formula after the classes have been
replaced by their mid-values.

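For the sake of ease in calculation a suitable number is chosen
and deviations of the variate from that number are found. If a
is the assumed mean and $\xi_i = x_i - a$ for i = 1, 2, 3, ..., n, then

$$\bar{\xi} = \frac1N\sum_{i=1}^{n} f_i \xi_i, \quad \text{where } N = \sum_{i=1}^{n} f_i,$$

$$= \frac1N\sum_{i=1}^{n} f_i (x_i - a) = \frac1N\sum_{i=1}^{n} f_i x_i - a = \bar{x} - a;$$

therefore $\bar{x} = a + \bar{\xi}$.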




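If in a grouped frequency distribution the class intervals are
of equal width, h say, then we work with the equation

$$\xi_i = \frac{x_i - a}{h}, \quad \text{where } \xi \text{ is the new variate,}$$

or in particular $x_i = a + h\xi_i$, so that

$$\frac1N\sum_{i=1}^{n} f_i x_i = \frac1N\sum_{i=1}^{n} f_i (a + h\xi_i)$$

or $\bar{x} = a + h\bar{\xi}$.

Sometimes the first moment of the variate X about any arbitrary
point a is denoted by

$$\mu_1' = \frac1N\sum_{i=1}^{n} f_i (x_i - a).$$

So $\mu_1'$ is actually the mean measured from the point x = a.

Clearly $\mu_1' = \bar{x} - a$ or $\bar{x} = a + \mu_1'$.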

For the continuous frequency distribution defined by
$dF = f(x)\,dx$, $-\infty < x < \infty$, the mean is

$$\bar{x} = \frac{\int_{-\infty}^{\infty} x f(x)\,dx}{\int_{-\infty}^{\infty} f(x)\,dx}$$

= abscissa of the centre of gravity of the area of the frequency
curve above OX.

In case f(x) dx is the proportional frequency for the interval
$(x - \tfrac12 dx,\ x + \tfrac12 dx)$ or (x, x + dx), then

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

and the above formula simplifies to

$$\bar{x} = \int_{-\infty}^{\infty} x f(x)\,dx.$$

In the distribution given by $dF = f(x)\,dx$, $-\infty < x < \infty$,
y = f(x) is the frequency curve and

$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$

is the distribution function.

If the actual frequencies are g(x), totalling N, we have the
first moment about x = a as

$$\mu_1' = \frac1N\int_{-\infty}^{\infty} (x - a)\,g(x)\,dx$$

and the mean $\bar{x} = a + \mu_1'$.





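2.13. Properties of the arithmetic mean.

1. The sum of the deviations of the values $x_i$ from their
mean $\bar{x}$ is zero.

For $\sum_i f_i (x_i - \bar{x}) = \sum_i f_i x_i - \bar{x}\sum_i f_i = N\bar{x} - \bar{x}N = 0$.

2. If $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$ are the means of k distributions with
respective frequencies $n_1, n_2, \ldots, n_k$, then $\bar{x}$, the mean of the whole
distribution, is given by

$$\bar{x} = \frac1N\sum_i n_i \bar{x}_i, \quad \text{where } N = \sum_i n_i \text{ is the total frequency.}$$

Let $x_{ij}$ occur with frequency $f_{ij}$, where $x_{ij}$ is the jth observation
in the ith distribution. It is given that

$$\sum_j f_{ij} = n_i \qquad ...(1)$$

and

$$\bar{x}_i = \frac{\sum_j f_{ij} x_{ij}}{\sum_j f_{ij}} = \frac{1}{n_i}\sum_j f_{ij} x_{ij}. \qquad ...(2)$$

Now

$$\bar{x} = \frac{\sum_i \sum_j f_{ij} x_{ij}}{\sum_i \sum_j f_{ij}} = \frac{\sum_i n_i \bar{x}_i}{\sum_i n_i} \quad \text{from (1) and (2) above}$$

$$= \frac1N\sum_i n_i \bar{x}_i. \qquad ...(3)$$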


2.14. Merits, Demerits and Uses of Arithmetic Mean.

Merits. 1. It is rigidly defined.

2. It is easy to understand and easy to compute.

3. It depends upon all the observations.

4. It is suitable for algebraic treatment.

The arithmetic mean of the composite series in terms of the
means and sizes of the component series is obtained by using the
formula:

$$\bar{x} = \sum_i n_i \bar{x}_i \Big/ \sum_i n_i.$$




5. It is called a stable average in consequence of the fact
that it is least affected by fluctuations of sampling. Hence, apart
from its sensitivity to extreme values, the requisites for an ideal
measure of central tendency suggested by Prof. Yule are met in
the case of the arithmetic mean.

Demerits. 1. Extreme values affect the arithmetic mean to a
large extent. The arithmetic mean then depicts a distorted picture of
the distribution, and its claim of remaining representative of the
distribution is forfeited.

2. The arithmetic mean may lead to wrong conclusions if we are
not provided details of the data from which it is computed.

For clarification of this point, we consider the following
marks obtained by two students A and B in three tests respectively.

Marks in    I Test    II Test    III Test    Average marks
A            40%       60%        80%         60%
B            80%       60%        40%         60%

Now we see that the average marks obtained by each of the
two students in three tests are 60%. On the basis of the average
marks, one may conclude that the level of intelligence of both the
students at the end of the year is the same. But this conclusion is
fallacious, for the data show that A has improved consistently
whereas B has deteriorated consistently.

3. Calculation of the arithmetic mean becomes impossible
in case an extreme class is open, e.g., below 10 or above 80. The mean
cannot be computed even when a single observation is missing.

Uses. It has unlimited uses in our daily life. It is generally
used to study financial and social problems. All kinds of
problems of import, export, income, expenditure, production,
consumption etc. can be tackled by the arithmetic mean.

Ex. 1. Find the mean of the frequency distribution:

Score (x):    12   11   10    9    8    7    6
Frequency:     1    1    5    4    6    2    1





Sol.

Scores x    Frequency f    f × x
   12            1           12
   11            1           11
   10            5           50
    9            4           36
    8            6           48
    7            2           14
    6            1            6
Total          N = 20     Σ fx = 177

Here N = 20, $\sum fx = 177$.

$$\text{Mean} = \frac{177}{20} = 8.85.$$

Ex. 2. Compute the arithmetic mean by the short-cut method from
the following frequency series:

Class intervals:   1-5   6-10   11-15   16-20   21-25
Frequency:          3     17      20       8       2


Sol. Let the assumed mean be a = 13.

Class interval   Mid-value x   Frequency f   Deviation ξ = x - 13    f ξ
    1-5               3             3              -10               -30
    6-10              8            17               -5               -85
   11-15             13            20                0                 0
   16-20             18             8                5                40
   21-25             23             2               10                20
   Total                           50                                -55

$$\bar{x} = a + \frac{\sum f\xi}{\sum f} = 13 + \frac{-55}{50} = 13 - 1.1 = 11.9.$$

Thus 11.9 is the required arithmetic mean.
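A small sketch of the short-cut (assumed mean) method follows. The function name `shortcut_mean` is our own; the mid-values 3, 8, 13, 18, 23 and the frequencies 3, 17, 20, 8, 2 are taken from the example, with assumed mean 13.

```python
def shortcut_mean(mids, freqs, a):
    """Mean via deviations from an assumed mean a: a + sum(f * (x - a)) / sum(f)."""
    N = sum(freqs)
    deviation_sum = sum(f * (x - a) for f, x in zip(freqs, mids))
    return a + deviation_sum / N

mids = [3, 8, 13, 18, 23]    # mid-values of the classes 1-5, ..., 21-25
freqs = [3, 17, 20, 8, 2]

result = shortcut_mean(mids, freqs, a=13)
direct = sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)
print(result, direct)
```

Both numbers agree, illustrating that the choice of assumed mean affects only the arithmetic, not the answer.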


\ 







Ex. 3. Find the mean for the following frequency distribution:

Years under:       10    20    30    40    50    60
No. of persons:    15    32    51    78    97   109


Sol. We arrange the data in the tabular form. Let the
assumed mean be 35.

Years     Number of persons f   Mid-value x   ξ = (x - 35)/10    f ξ
 0-10            15                  5              -3           -45
10-20            17                 15              -2           -34
20-30            19                 25              -1           -19
30-40            27                 35               0             0
40-50            19                 45               1            19
50-60            12                 55               2            24
Total           109                                              -55

$$\bar{\xi} = \frac{-55}{109}$$

and

$$\bar{x} = 35 + 10 \times \left(\frac{-55}{109}\right) = 29.95 \text{ years approximately.}$$
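The two steps of this solution, recovering the class frequencies from the "under" (cumulative) totals and then applying step deviations with width 10, can be sketched as follows. The variable names are illustrative; the data are those of the example.

```python
upper = [10, 20, 30, 40, 50, 60]         # "years under" boundaries
cumulative = [15, 32, 51, 78, 97, 109]   # cumulative numbers of persons

# Class frequencies are successive differences of the cumulative totals.
freqs = [c - prev for c, prev in zip(cumulative, [0] + cumulative[:-1])]
mids = [u - 5 for u in upper]            # mid-values of classes of width 10

a, h = 35, 10                            # assumed mean and class width
N = sum(freqs)
xi_sum = sum(f * (x - a) // h for f, x in zip(freqs, mids))  # sum of f * xi
mean = a + h * xi_sum / N

print(freqs, xi_sum, round(mean, 2))
```

Integer division by `h` is exact here because every deviation from 35 is a multiple of 10.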


Ex. 4. Find the mean for the normal distribution

$$dF = C\,e^{-b^2(x-a)^2}\,dx, \quad -\infty < x < \infty.$$

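Sol. We first determine the constant C from the relation

$$C\int_{-\infty}^{\infty} e^{-b^2(x-a)^2}\,dx = 1.$$

We make the substitution y = b(x - a), and therefore dy = b dx;
hence

$$\frac{C}{b}\int_{-\infty}^{\infty} e^{-y^2}\,dy = 1 \quad \text{or} \quad \frac{C}{b}\sqrt{\pi} = 1, \ \text{i.e. } C = \frac{b}{\sqrt{\pi}}.$$

Therefore we write

$$dF = \frac{b}{\sqrt{\pi}}\,e^{-b^2(x-a)^2}\,dx, \quad -\infty < x < \infty,$$

and

$$\bar{x} = \int_{-\infty}^{\infty} x\,\frac{b}{\sqrt{\pi}}\,e^{-b^2(x-a)^2}\,dx.$$

The substitution y = b(x - a) gives

$$\bar{x} = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\left(a + \frac{y}{b}\right)e^{-y^2}\,dy = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} a\,e^{-y^2}\,dy,$$

since the limits are symmetrical and $\frac{y}{b}\,e^{-y^2}$ is an odd function. Hence

$$\bar{x} = a.$$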


Ex. 5. Find the mean of the Beta distribution

$$dF = \frac{1}{B(m, n)}(1-x)^{m-1}x^{n-1}\,dx, \quad 0 < x < 1.$$

Sol. We observe that

$$\frac{1}{B(m, n)}\int_0^1 (1-x)^{m-1}x^{n-1}\,dx = 1$$

and hence

$$\bar{x} = \frac{1}{B(m, n)}\int_0^1 x\{(1-x)^{m-1}x^{n-1}\}\,dx = \frac{1}{B(m, n)}\int_0^1 (1-x)^{m-1}x^{n}\,dx$$

$$= \frac{B(m, n+1)}{B(m, n)} = \frac{\Gamma(m)\,\Gamma(n+1)}{\Gamma(m+n+1)} \cdot \frac{\Gamma(m+n)}{\Gamma(m)\,\Gamma(n)} = \frac{n}{m+n}$$

[remember $\Gamma(n+1) = n\,\Gamma(n)$].

2.15. The Geometric mean and the Harmonic mean.

The geometric mean of n positive values $x_1, x_2, \ldots, x_n$
is defined by

$$G = (x_1 x_2 \cdots x_n)^{1/n}$$

or

$$\log G = \frac1n\sum_{i=1}^{n} \log x_i.$$

If we assume that the value $x_i$ is repeated with frequency $f_i$,
then

$$G = \left(\prod_{i=1}^{n} x_i^{f_i}\right)^{1/N}, \quad \text{where } N = \text{total frequency} = \sum f_i,$$

or

$$\log G = \frac1N\sum_{i=1}^{n} f_i \log x_i.$$




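Again if we let $f_i/N = p_i$, then

$$\log G = \sum_{i=1}^{n} p_i \log x_i, \quad \text{where } \sum_{i=1}^{n} p_i = 1.$$

For the continuous frequency distribution $dF = f(x)\,dx$ of a
positive variate,

$$\log G = \frac{\int f(x)\log x\,dx}{\int f(x)\,dx},$$

the integrals being taken over the range of the variate. If

$$\int f(x)\,dx = 1,$$

then

$$\log G = \int f(x)\log x\,dx.$$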


The harmonic mean of n variate values $x_i$ with frequencies $f_i$
is the reciprocal of the arithmetic mean of their reciprocals. It is
given by

$$\frac1H = \frac1N\sum_{i=1}^{n}\frac{f_i}{x_i}, \quad \text{where } N = \sum_{i=1}^{n} f_i.$$

For proportional frequencies, i.e. when $f_i/N = p_i$,

$$\frac1H = \sum_{i=1}^{n}\frac{p_i}{x_i}.$$

For the continuous frequency distribution $dF = f(x)\,dx$,
$-\infty < x < +\infty$,

$$\frac1H = \frac{\int_{-\infty}^{\infty}\frac1x f(x)\,dx}{\int_{-\infty}^{\infty} f(x)\,dx}.$$

If $\int_{-\infty}^{\infty} f(x)\,dx = 1$, i.e. when f(x) dx is the probability of
the variate x falling in the interval $(x - \tfrac12 dx,\ x + \tfrac12 dx)$,

$$\frac1H = \int_{-\infty}^{\infty}\frac1x f(x)\,dx.$$

The geometric and the harmonic means are used only in
elementary statistics; they are not important in advanced theory.

2.16. Merits, Demerits and Uses of Geometric Mean.

Merits. 1. It is rigidly defined.

2. It is based on all the observations.

3. It is amenable to algebraic treatment.

4. It is not affected to any great extent by fluctuations of
sampling.

5. It gives comparatively more weight to small items.



Demerits. 1. It is not easily understandable and is not easy
to find for a non-mathematics student.

2. The geometric mean vanishes if any one of the observations
is zero, and it may become imaginary (irrespective of the magnitude
of the other items) if any one of the observations is negative.

3. It cannot be determined for open intervals.

Uses. It is generally used in determining the rate of population
growth and the rate of interest. It helps us in constructing
index numbers.


2.17. Merits, Demerits and Uses of Harmonic Mean.

Merits. 1. It is rigidly defined.

2. It is based on all the observations.

3. It is amenable to algebraic treatment.

4. Like the geometric mean, it is not affected much by
fluctuations of sampling.

5. It gives comparatively more weight to small items and is
used only when a very high weightage has to be given to small
items.

Demerits: It is not easily understandable and is difficult to
compute.

Uses. The uses of the harmonic mean are limited. It is generally
used to find the average speed and quantity prices.

Ex. 1. Compute the geometric mean of the following series:

Size:         8    10    12    14    16    18
Frequency:    6    10    20     8     5     1

Sol.

Size x_i   Frequency f_i   log x_i    f_i log x_i
    8            6          0.9031       5.4186
   10           10          1.0000      10.0000
   12           20          1.0792      21.5840
   14            8          1.1461       9.1688
   16            5          1.2041       6.0205
   18            1          1.2553       1.2553
Total           50                      53.4472

The geometric mean is defined by

$$\log G = \frac{\sum_i f_i \log x_i}{\sum_i f_i}.$$

Now $\log G = \dfrac{53.4472}{50} = 1.06894$.

Then G = 11.72.
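The logarithmic computation can be repeated without tables. This sketch uses the frequencies of the solution table (6, 10, 20, 8, 5, 1, totalling 50); the names `sizes`, `freqs`, `log_G` and `G` are illustrative.

```python
import math

sizes = [8, 10, 12, 14, 16, 18]
freqs = [6, 10, 20, 8, 5, 1]   # frequencies as used in the solution table

N = sum(freqs)
# log G = (1/N) * sum of f * log10(x)
log_G = sum(f * math.log10(x) for f, x in zip(freqs, sizes)) / N
G = 10 ** log_G

print(N, round(log_G, 5), round(G, 2))
```

Full-precision logarithms give a value of log G that agrees with the four-figure table value to about five decimal places.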

Ex. 2. Compute the harmonic mean of the following data:

Size:      60    80    150    160    200
Weight:     3    10      2      4      1

Sol.

Size x_i   Frequency f_i    1/x_i      f_i/x_i
    60           3          0.01667    0.05001
    80          10          0.01250    0.12500
   150           2          0.00667    0.01334
   160           4          0.00625    0.02500
   200           1          0.00500    0.00500
Total       Σ f_i = 20                 0.21835

The harmonic mean is defined by the formula

$$\frac1H = \frac{\sum_i f_i/x_i}{\sum_i f_i} = \frac{0.21835}{20},$$

which gives

$$H = \frac{20}{0.21835} = 91.60.$$
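A direct computation of the weighted harmonic mean is sketched below, using exact reciprocals rather than five-place rounded ones, so the last digits may differ slightly from a hand computation. The names `sizes`, `weights` and `H` are illustrative.

```python
sizes = [60, 80, 150, 160, 200]
weights = [3, 10, 2, 4, 1]

N = sum(weights)
# 1/H = (1/N) * sum of f / x, hence H = N / sum(f / x)
reciprocal_sum = sum(f / x for f, x in zip(weights, sizes))
H = N / reciprocal_sum

print(N, round(reciprocal_sum, 5), round(H, 2))
```

Note that H lies well below the weighted arithmetic mean of the same data, as expected for a measure that gives more weight to small items.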

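Ex. 3. Find the geometric and harmonic means of the
distribution

$$dF = \frac{1}{B(m, n)}(1-x)^{m-1}x^{n-1}\,dx, \quad 0 < x < 1.$$

Sol. We have

$$\log G = \frac{1}{B(m, n)}\int_0^1 (1-x)^{m-1}x^{n-1}\log x\,dx.$$

Since

$$\int_0^1 (1-x)^{m-1}x^{n-1}\,dx = B(m, n),$$

we have, on differentiating both sides with respect to n (this is
permissible in view of the uniform convergence of the integral and
the existence of the resulting expressions):

$$\int_0^1 (1-x)^{m-1}x^{n-1}\log x\,dx = \frac{\partial}{\partial n}B(m, n).$$

[Remember $\frac{\partial}{\partial n}x^{n-1} = \frac{\partial}{\partial n}e^{(n-1)\log x} = e^{(n-1)\log x}\cdot\log x = x^{n-1}\log x$.]

Thus

$$\log G = \frac{1}{B(m, n)}\,\frac{\partial}{\partial n}B(m, n) = \frac{\partial}{\partial n}\log B(m, n) = \frac{\partial}{\partial n}\log\frac{\Gamma(m)\,\Gamma(n)}{\Gamma(m+n)} = \frac{\partial}{\partial n}\{\log\Gamma(n) - \log\Gamma(m+n)\}.$$

The harmonic mean (for n > 1) is given by

$$\frac1H = \frac{1}{B(m, n)}\int_0^1 \frac1x(1-x)^{m-1}x^{n-1}\,dx = \frac{1}{B(m, n)}\int_0^1 (1-x)^{m-1}x^{n-2}\,dx$$

$$= \frac{B(m, n-1)}{B(m, n)} = \frac{\Gamma(m)\,\Gamma(n-1)}{\Gamma(m+n-1)}\cdot\frac{\Gamma(m+n)}{\Gamma(m)\,\Gamma(n)} = \frac{m+n-1}{n-1},$$

so that

$$H = \frac{n-1}{m+n-1}.$$

We may note that the arithmetic mean, $\frac{n}{m+n}$, is greater than
the harmonic mean, for

$$1 - \frac{n}{m+n} = \frac{m}{m+n}, \qquad 1 - \frac{n-1}{m+n-1} = \frac{m}{m+n-1},$$

and clearly

$$\frac{m}{m+n-1} > \frac{m}{m+n}.$$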

Theorem 3. For distributions in which the variate values are
positive,

$$H \le G \le A,$$

where the symbol A stands for the arithmetic mean.

Let us consider the quantity

$$E(p) = \left[\frac1n\left(x_1^p + x_2^p + \cdots + x_n^p\right)\right]^{1/p},$$

where the x's are positive numbers.

We know that E(p) is an increasing function of p,
i.e. $E(p_1) \ge E(p_2)$ if $p_1 > p_2$.

Now when p = 1, we have

$$E(1) = \frac1n(x_1 + x_2 + \cdots + x_n) = A.$$

When p = -1,

$$E(-1) = \left[\frac1n\left(\frac1{x_1} + \frac1{x_2} + \cdots + \frac1{x_n}\right)\right]^{-1} = H.$$

When $p \to 0$ we have the geometric mean, for

$$\lim_{p \to 0}\log E(p) = \lim_{p \to 0}\frac{\log\left(\frac1n\sum x^p\right)}{p}.$$

The expression on the right is of the indeterminate form 0/0
and so, applying L'Hospital's rule, we have

$$\lim_{p \to 0}\log E(p) = \lim_{p \to 0}\frac{\sum x^p\log x}{\sum x^p} = \frac1n\sum\log x$$

= the logarithm of the geometric mean.

Therefore, in view of E being a non-decreasing function of
p, obviously $E(-1) \le E(0) \le E(1)$,
that is, the inequality

$$H \le G \le A$$

stands proved.
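The behaviour of the mean of order p can be observed numerically. The sketch below defines a hypothetical helper `power_mean` (our name) that returns $[\frac{1}{n}\sum x_i^p]^{1/p}$ for $p \ne 0$ and the geometric mean for p = 0; the sample values 2, 4, 8 are arbitrary.

```python
import math

def power_mean(xs, p):
    """Mean of order p of positive numbers; p = 0 is taken as the geometric mean."""
    n = len(xs)
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1 / p)

xs = [2.0, 4.0, 8.0]   # arbitrary positive values

H = power_mean(xs, -1)   # harmonic mean
G = power_mean(xs, 0)    # geometric mean
A = power_mean(xs, 1)    # arithmetic mean
print(H, G, A)

# For small p the mean of order p approaches the geometric mean.
near_zero = power_mean(xs, 1e-6)
print(near_zero)
```

Evaluating `power_mean` on a grid of p values shows the monotone increase on which the proof relies.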

Exercises 

1. The yields of grain (x lbs) from 500 small plots are grouped
into classes with a common class-interval (0.2 lb) in the table
below, the values of x given being the mid-values of the classes.
Show that the mean of the distribution is 3.95 lbs.

  x      f         x      f
 2.8     4        4.2    60
 3.0    15        4.4    59
 3.2    20        4.6    35
 3.4    47        4.8    10
 3.6    63        5.0     8
 3.8    78        5.2     4
 4.0    88


2. Find the arithmetic mean of the following distribution:

x :    1    2    3    4    5    6    7
f :    5    9   12   17   14   10    6

[Ans: 4.09]

3. Compute the arithmetic mean of the marks from the following
table:

Marks:              0-10   10-20   20-30   30-40   40-50
No. of Students:     12      18      27      20      17

[Ans: 28]




4. Given the following frequency distribution, find the mean.

Monthly wages   No. of workers      Monthly wages   No. of workers
 12.5-17.5            2              37.5-42.5            4
 17.5-22.5           22              42.5-47.5            6
 22.5-27.5       [illegible]         47.5-52.5            1
 27.5-32.5           14              52.5-57.5            1
 32.5-37.5            3


5. The frequency distribution below gives the cost of production
of sugarcane in different holdings. Find the arithmetic mean.

Class     Frequency      Class     Frequency
 2-6          1          18-22        52
 6-10         9          22-26        36
10-14        21          26-30        19
14-18        47          30-34         3

[Ans: 19.21]


6. The monthly incomes of 10 families in a certain locality
are given below:

ABCDEFGHIJ Total 
85 70 15/ 70 500 20 45 290 40 36 1 36 

Find the arithmetic, geometric and harmonic means of the
above incomes. Which one of the three averages represents the

above figures the best ? Give reasons. ( ^ M A , 955) 

[Ans ; A. , 3 /. = 1^ 13.60 ^(2 = 63’59, 77=42 00; <7 is the best) 
7. Find the geometric mean of the following distribution:

Marks:              0-10   10-20   20-30   30-40   40-50
No. of Students:      5      7     ...

[Ans: 25.64 marks]

Find the harmonic mean of the following distribution. 

V ■ 10 12 16 42 25 3Q 


f- 


8 


12 


6 2 

A 16 45 approx]. 

9. Show that for a set of positive values of the variate the 
arithmetic, geometric and harmonic means are the special cases 
of the p th roots of the mean of the powers of the variates. 





10. In a frequency table, the upper boundary of each interval
has a constant ratio to the lower boundary. Show that the
geometric mean, G, may be expressed by the formula

$$\log G = x_0 + \frac{c}{N}\sum_i f_i (i - 1),$$

where $x_0$ is the logarithm of the mid-value of the first interval and
c is the logarithm of the ratio of the upper to the lower boundary.
[M.Sc. Agra 1967, 72]

[Hint:

Class interval        Mid-value               Frequency
$x_1 - x_2$           $(x_1 + x_2)/2$           $f_1$
$x_2 - x_3$           $(x_2 + x_3)/2$           $f_2$
...                   ...                       ...
$x_i - x_{i+1}$       $(x_i + x_{i+1})/2$       $f_i$
Total                                            N

Given:

$$\frac{x_2}{x_1} = \frac{x_3}{x_2} = \cdots = \frac{x_{i+1}}{x_i} = e^c, \qquad ...(1)$$

$$x_0 = \log\frac{x_1 + x_2}{2}. \qquad ...(2)$$

From (1), $x_2 = x_1 e^{c}$, $x_3 = x_1 e^{2c}$, ..., $x_{i+1} = x_1 e^{ic}$.

From (2), $2e^{x_0} = x_1(1 + e^c) \Rightarrow x_1 = \dfrac{2e^{x_0}}{1 + e^c}$.

Now

$$\log G = \frac1N\sum_i f_i \log\frac{x_i + x_{i+1}}{2} = \frac1N\sum_i f_i \log\frac{x_1 e^{(i-1)c} + x_1 e^{ic}}{2}$$

$$= \frac1N\sum_i f_i \log\left\{\frac{2e^{x_0}}{1 + e^c}\cdot\frac{e^{(i-1)c}(1 + e^c)}{2}\right\} = \frac1N\sum_i f_i \log\left\{e^{x_0}e^{(i-1)c}\right\}$$

$$= \frac1N\sum_i f_i\{x_0 + (i-1)c\} = x_0 + \frac{c}{N}\sum_i f_i (i-1), \quad \text{since } \sum_i f_i = N.]$$


Ex. 11. Show that

$$\Big(\sum_i \omega_i\Big)\Big[\sum_i \omega_i (x_i-x_m)^2\Big]=\sum_i\sum_{j>i}\omega_i\omega_j (x_i-x_j)^2,$$

where $\sum_i\omega_i\neq 0$ and $x_m$ is the weighted mean of the $x_i$'s with weights $\omega_i$.

[Delhi B.Sc. 1968]

[Hint. $x_m=\dfrac{\omega_1x_1+\omega_2x_2+\omega_3x_3}{\omega_1+\omega_2+\omega_3}$, if three variables only are considered. We easily have

$$\frac1N\sum_i\omega_i(x_i-x_m)^2=\frac1N\sum_i\omega_ix_i^2-\Big(\frac1N\sum_i\omega_ix_i\Big)^2,\qquad N=\omega_1+\omega_2+\omega_3,$$

or

$$N\sum_i\omega_i(x_i-x_m)^2=N\sum_i\omega_ix_i^2-\Big(\sum_i\omega_ix_i\Big)^2,$$

or

$$\sum_i\omega_i\Big[\sum_i\omega_i(x_i-x_m)^2\Big]=(\omega_1+\omega_2+\omega_3)(\omega_1x_1^2+\omega_2x_2^2+\omega_3x_3^2)-(\omega_1x_1+\omega_2x_2+\omega_3x_3)^2$$
$$=\omega_1\omega_2[x_1^2+x_2^2-2x_1x_2]+\omega_1\omega_3[x_1^2+x_3^2-2x_1x_3]+\omega_2\omega_3[x_2^2+x_3^2-2x_2x_3]$$
$$=\omega_1\omega_2(x_1-x_2)^2+\omega_1\omega_3(x_1-x_3)^2+\omega_2\omega_3(x_2-x_3)^2$$
$$=\sum_{i=1}^{3}\sum_{j>i}\omega_i\omega_j(x_i-x_j)^2.$$

The result can be generalized for n variables.]

Ex. 12. A variate takes values $a, ar, ar^2, \dots, ar^{n-1}$ each with frequency unity. Show that the A.M., A, is $\dfrac{a(1-r^n)}{n(1-r)}$, the G.M., G, is $a\,r^{(n-1)/2}$ and the H.M., H, is $\dfrac{an(1-r)\,r^{n-1}}{1-r^n}$. Prove that $AH=G^2$. Prove also that $A>G>H$ unless $r=1$, when all the three means coincide. [Kanpur B.Sc. 1969]

Ex. 13. The frequencies of values 0, 1, 2, ..., n of a variable are given by

$$q^n,\ {}^nC_1\,q^{n-1}p,\ {}^nC_2\,q^{n-2}p^2,\ \dots,\ p^n,$$

where $p+q=1$. Show that the mean is $np$.

Ex. 14. Show that in finding the arithmetic mean of a set of readings on a thermometer it does not matter whether we measure temperature in Centigrade or Fahrenheit, but that in finding the geometric mean it does matter which scale we use.

[Agra M.Sc. 1972]

[Hint. Let $C_i$, $i=1, 2, \dots, n$ be readings in Centigrade. The relation between Centigrade and Fahrenheit is given by

$$\frac{F_i-32}{180}=\frac{C_i}{100},\quad i=1,2,\dots,n,\qquad\text{or}\qquad F_i=32+\tfrac95 C_i.$$

$\therefore\ \bar F=32+\tfrac95\bar C$, where $\bar C$ and $\bar F$ denote arithmetic means.

Geometric mean is given by

$$\log G_C=\frac1n\sum_{i=1}^{n}\log C_i$$

and also

$$\log G_F=\frac1n\sum_{i=1}^{n}\log F_i=\frac1n\sum_{i=1}^{n}\log\big(32+\tfrac95 C_i\big)\neq\log\big(32+\tfrac95 G_C\big).]$$

2.18. Median. Quartiles. The median value is the value of the variate which divides the total frequency into two equal halves. For the continuous distribution with frequency function $f(x)$, the median $\mu_e$ is given by

$$\int_{-\infty}^{\mu_e} f(x)\,dx=\int_{\mu_e}^{\infty} f(x)\,dx=\frac12\int_{-\infty}^{\infty} f(x)\,dx.$$

The ordinate $x=\mu_e$ bisects the area bounded by the frequency curve $y=f(x)$ and the x-axis.

If the variate values are arranged in ascending or descending order of magnitude, the middlemost value is the median. If there are 2n variate values, the median is usually taken as the arithmetic mean of the nth and the (n+1)th values. For the grouped frequency distribution, the median is given by the formula

$$\text{Median}=l+\frac{\frac N2-C}{f}\times h,$$

where l is the lower limit of the class in which the median lies,
f is the frequency of this class,
h is the width of this class,
C is the cumulative frequency up to and including the class preceding the class in which the median lies, and N is the total frequency.

The median is easily calculated and is not affected by the extreme values.


Suppose the total number of observations, arranged in order of magnitude of the variate, is divided into four equal parts (instead of two as in the case of the median). If the dividing ordinates are $x=\mu_1$, $x=\mu_2$, $x=\mu_3$, then $\mu_2$ is the median, $\mu_1$ is the first quartile and $\mu_3$ is the third quartile. The quartiles from the frequency table are calculated by interpolation in the same way as the median. The formula for these assumes the form

$$Q_i=l+\frac{\frac{iN}{4}-C}{f}\times h,\qquad i=1, 2, 3,$$

where $Q_i$ stands for the quartiles and the other symbols have the usual meaning as above.

Suppose the total number of observations is divided into n equal parts by the dividing ordinates $x=x_1, x=x_2, \dots, x=x_{n-1}$; we get quantiles, or partition values.

If n = 2, we have the median,
n = 4, we have the quartiles,
n = 10, we have the deciles,
n = 100, we have the percentiles.

For the grouped frequency distribution, the deciles and percentiles are given by the formulae :

$$D_j=l+\frac{\frac{jN}{10}-C}{f}\times h,\qquad j=1, 2, \dots, 9,$$

$$P_k=l+\frac{\frac{kN}{100}-C}{f}\times h,\qquad k=1, 2, \dots, 99,$$

where the symbols on the right have the same meaning as in the definition of the median.

For a given frequency function $f(x)$, the pth quantile is the value of $\mu_p$ given by

$$\int_{-\infty}^{\mu_p} f(x)\,dx=\frac pn\int_{-\infty}^{\infty} f(x)\,dx.$$

First quartile is given by

$$\int_{-\infty}^{\mu_1} f(x)\,dx=\tfrac14 N$$

and third quartile by

$$\int_{-\infty}^{\mu_3} f(x)\,dx=\tfrac34 N,$$

where N = total number of observations.
The seventh decile is given by

$$\int_{-\infty}^{\mu_7} f(x)\,dx=\frac{7}{10}\int_{-\infty}^{\infty} f(x)\,dx.$$

2.19. Merits, Demerits and Uses of Median.

Merits. 1. It is rigidly defined.

2. It is easily understandable and is easy to find. In some cases it may be located merely by inspection.

3. It is unaffected by extreme values.

4. Its calculation is possible for distributions with open-end classes.

Demerits. 1. Its exact calculation for an even number of observations is impossible. It is estimated merely by taking the mean of the two middle terms.

2. It is not based on all the observations.

3. It cannot be handled by algebraic treatment.

4. Fluctuations of sampling affect it more than they affect the arithmetic mean.

Uses. 1. It deals with qualitative data (regarding honesty, intelligence, etc. of people). They cannot be measured quantitatively but can still be arranged in ascending or descending order of magnitude. Thus we are led to determine the average intelligence or average honesty among a group of people.

2. It is also used for finding a typical value in problems concerning wages, wealth etc.

Ex. Calculate the median, quartiles and 6th decile from the following data :

Marks less than     80    70    60    50    40    30    20    10
No. of students    100    90    80    60    32    20    13     5

Sol. We arrange the data as below :

Marks group     Cum. frequency     Frequency
0-10                  5             5
10-20                13             8  (13 - 5 = 8)
20-30                20             7  (20 - 13 = 7)
30-40                32            12
40-50                60            28
50-60                80            20
60-70                90            10
70-80               100            10
                                   N = 100


Median = marks corresponding to the $\frac N2$th i.e. 50th student. Observing the cum. frequency column we note 32 < 50 < 60.
$\therefore$ marks for the 50th student will lie in the group 40-50. Using the formula for the median,

$$\text{median}=l+\frac{\frac N2-C}{f}\times h=40+\frac{50-32}{28}\times 10=40+\frac{18}{28}\times 10=40+6.4=46.4\text{ marks.}$$

Lower quartile = marks corresponding to the $\frac N4$th i.e. 25th student. This lies in the group 30-40.

$$\therefore\ \text{lower quartile}=Q_1=30+\frac{25-20}{12}\times 10=34.2\text{ marks.}$$

Upper quartile = marks corresponding to the $\frac{3N}{4}$th i.e. 75th student. This lies in the group 50-60.

$$\therefore\ \text{upper quartile}=Q_3=50+\frac{75-60}{20}\times 10=57.5\text{ marks.}$$

Sixth decile = marks corresponding to the $\frac{6N}{10}$th i.e. 60th student. This lies in the group 40-50.

$$\therefore\ D_6=40+\frac{60-32}{28}\times 10=50\text{ marks.}$$
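The interpolation used for the median, quartiles and deciles in the example above is the same rule with a different target frequency. The following is a minimal Python sketch of that rule, not part of the original text; the function name `grouped_quantile` and its arguments are illustrative assumptions, and the classes are assumed to have equal width.

```python
from itertools import accumulate


def grouped_quantile(lower_limits, width, freqs, k, parts):
    """k-th partition value when the total frequency is cut into `parts` equal parts.

    Implements l + ((k*N/parts - C) / f) * h for equal-width classes.
    (Hypothetical helper written for illustration only.)
    """
    total = sum(freqs)
    target = k * total / parts              # e.g. N/2 for the median
    cumulative = list(accumulate(freqs))    # running totals of the frequencies
    for idx, cum in enumerate(cumulative):
        if cum >= target:                   # first class that reaches the target
            below = cumulative[idx - 1] if idx > 0 else 0   # C in the formula
            return lower_limits[idx] + (target - below) / freqs[idx] * width
    raise ValueError("target frequency exceeds total frequency")


# Data of the worked example: classes 0-10, 10-20, ..., 70-80.
lower_limits = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [5, 8, 7, 12, 28, 20, 10, 10]

median = grouped_quantile(lower_limits, 10, freqs, 1, 2)
q1 = grouped_quantile(lower_limits, 10, freqs, 1, 4)
q3 = grouped_quantile(lower_limits, 10, freqs, 3, 4)
d6 = grouped_quantile(lower_limits, 10, freqs, 6, 10)

print(round(median, 1), round(q1, 1), round(q3, 1), round(d6, 1))
# prints: 46.4 34.2 57.5 50.0
```

Passing the partition as the pair (k, parts) rather than as a decimal fraction keeps the target frequency exact for cases such as the sixth decile, where the target coincides with a cumulative frequency.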

2.20. Mode : The mode or the modal value is the value of the variate for which the frequency is greatest. If the frequency function is continuous and differentiable, it is the solution of

$$f'(x)=\frac{d}{dx}f(x)=0,\qquad f''(x)=\frac{d^2}{dx^2}f(x)<0.$$

If $f'(x)=0$ and $f''(x)>0$, the frequency is a minimum, and the corresponding value of the variate is sometimes called an Anti-mode.

For the grouped frequency distribution, the mode is computed by the formula :

$$\text{Mode}=l+\frac{f_m-f_1}{2f_m-f_1-f_2}\times h,$$

where l is the lower limit of the modal class (the class having maximum frequency), $f_m$ is the maximum frequency, $f_1$ and $f_2$ are the frequencies of the classes preceding and following the modal class and h is the width of the interval. From the formula it is obvious that the modal class is divided in the ratio $(f_m-f_1):(f_m-f_2)$.



2.21. Merits, Demerits and Uses of Mode.

Merits. 1. It is easily understandable and is easy to find. In some cases we can locate the mode, like the median, by mere inspection.

2. Extreme values have no effect at all on the mode.

3. The mode is easily located even when the frequency distribution has class intervals of unequal magnitude, provided the modal class and the classes preceding and succeeding it are of the same magnitude.

4. No problem of any sort is created by open-end classes in locating the mode.

Demerits. 1. It is not well defined. It is not always possible to find a well defined mode. Sometimes we have distributions with two modes. Such distributions are bimodal. If a distribution possesses more than two modes, it is said to be multimodal.

2. It is not based on all the observations.

3. It is not amenable to algebraic treatment.

4. In comparison with the mean, it is affected to a large extent by fluctuations of sampling.

Uses. In spite of having many defects, the mode plays a very important role in our daily life as well as in industry. It acts as a guide in business forecasting and in the manufacture of ready-made garments, shoes etc. It also determines the popularity of manufactured products. Its use also lies in seasonal forecasts.

Ex. 1. Find the mode of the distribution given in the Example of §2.19 above.

Sol. We notice that the maximum frequency i.e. 28 occurs in the interval 40-50.

Now $l=40$, $h=10$, $f_1=12$, $f_2=20$, $f_m=28$.

Hence

$$\text{mode}=40+\frac{28-12}{56-12-20}\times 10=46.7\text{ marks.}$$
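The arithmetic of Ex. 1 can be checked with a few lines of Python. This is only a sketch of the grouped-mode formula of §2.20; the function name `grouped_mode` is an illustrative assumption.

```python
def grouped_mode(lower_limit, width, f_modal, f_before, f_after):
    """Mode = l + (f_m - f_1) / (2 f_m - f_1 - f_2) * h for the modal class.

    (Hypothetical helper written for illustration only.)
    """
    return lower_limit + (f_modal - f_before) / (2 * f_modal - f_before - f_after) * width


# Modal class 40-50 with frequency 28; neighbouring frequencies 12 and 20.
mode = grouped_mode(40, 10, 28, 12, 20)
print(round(mode, 1))  # prints: 46.7
```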

Ex. 2. Find the mode of the following frequency distribution.

Size (x) :        1    2    3    4    5    6    7    8    9   10   11   12
Frequency (f) :   8   13   20   28   40   45   37   33   25   50   19   11

The distribution is regular in that the frequencies increase steadily up to 45 and then decrease, but the frequency 50 after 25 does not seem to be consistent with the distribution. Hence we cannot claim that the mode is 10 merely because the maximum frequency is 50. We shall now apply the method of grouping, as explained below, to locate the mode.

                          Frequency
Size (x)    (1)    (2)    (3)    (4)    (5)    (6)
    1         8     21            41
    2        13            33            61
    3        20     48                          88
    4        28            68    113
    5        40     85                  122
    6        45            82                  115
    7        37     70            95
    8        33            58           108
    9        25     75                          94
   10        50            69     80
   11        19     30
   12        11

Each entry in columns (2) to (6) is the total of the group of consecutive frequencies that begins on its line.

In column (1) the frequencies are the original ones. Column (2) is obtained by combining the frequencies two by two. Then, leaving the first frequency, the combination of the remaining frequencies two by two gives column (3). Column (2) is repeated if we combine the frequencies two by two after neglecting the first two frequencies. Hence we next combine the frequencies three by three, which yields column (4). The combination of frequencies three by three after leaving the first frequency, and then the first two frequencies, produces column (5) and column (6) in turn.

The maximum frequency in each column is given in black type. The following is the table to find the mode.

Column no.     Maximum frequency     Value or combination of values of x
                                     showing maximum frequency in (2)
   (1)                (2)                          (3)
   (1)                 50                          10
   (2)                 85                          5, 6
   (3)                 82                          6, 7
   (4)                113                          4, 5, 6
   (5)                122                          5, 6, 7
   (6)                115                          6, 7, 8


Examining the values in column (3) shows that the value 6 is repeated the maximum number of times. Hence we say that the value of the mode is 6 and not 10, which is in fact irregular.

Theorem 4. For a symmetrical distribution, the mean, the median and the mode coincide.

Let $f(x)$ be the frequency function for a symmetrical distribution, the line of symmetry being taken as the y-axis. The frequency curve $y=f(x)$ is symmetrical about OY.

The arithmetic mean

$$\bar x=\int_{-\infty}^{\infty} x f(x)\,dx\Big/\int_{-\infty}^{\infty} f(x)\,dx.$$

Since $f(-x)=f(x)$ for symmetry about OY, $x f(x)$ is an odd function.

$$\therefore\ \int_{-\infty}^{\infty} x f(x)\,dx=0.\quad\text{Hence }\bar x=0.$$

Again, since OY divides the area of $y=f(x)$ above Ox equally,

$\therefore$ Number of observations from 0 to $\infty$ = Number of observations from $-\infty$ to 0;

hence, median = 0.

The frequencies for $x$ and $-x$ are equal for all values of $x$ due to symmetry about OY.

As $x$ approaches zero, the equal ordinates of the curve coincide at $x=0$ with a maximum or a minimum, whichever exists, and hence if there is a mode, it must be at $x=0$.

For a moderately skew (asymmetrical) frequency distribution we have

Mean - Mode = 3 (Mean - Median).

The normal distribution curve

$$y=c\,e^{-\frac12 x^2},\qquad -\infty<x<\infty,$$

is symmetrical about the y-axis and the mean, median and mode are all zero.

Remark. In elementary theory the median and the mode are frequently used as measures of location. They are easily understandable. The median is the middle value and the mode is the most popular value, and the median is more easily calculated than the mean in numerically specified distributions. The arithmetic mean has greater importance in advanced theory because it lends itself easily to mathematical manipulation and possesses certain sampling properties, but the median has some other compensating advantages, viz. it is less dependent on the scale and the form of the frequency distribution than the mean, and it seems likely that it may find more use in advanced theory than hitherto.

EXERCISES 


1. Calculate the average speed of a car running at the rate of 15 miles per hour (m.p.h.) during the first 30 miles; at 20 m.p.h. during the second 30 miles; and at 25 m.p.h. during the third 30 miles.

[Hint : Harmonic mean = 19.15 m.p.h. approx.]

2. A man motors from A to B. A large part of the distance is uphill and he gets a mileage of only 10 miles per gallon of gasoline. On the return trip he makes 15 miles per gallon. Find the harmonic mean of his mileage. Verify the fact that this is the proper average to use by assuming that the distance from A to B is 60 miles. [Agra M.Sc. 57]

[Hint. $\dfrac1H=\dfrac{6+4}{120}=0.0833 \Rightarrow H=12$ miles per gallon approx.; A.M. = 12.5 miles per gallon.]





3. If the prices of a given commodity for ten years are given in seers per rupee and for the next ten years in rupees per seer, what is the quickest way of finding the average price over twenty years and why ?

[Hint. Prices may be stated in two different ways which are reciprocally related, the resulting arithmetic mean of the one being the harmonic mean of the other. The harmonic mean is always lower than the arithmetic mean.]

4. Find the median, lower and upper quartiles, 4th decile and 40th percentile for the following distribution.

Marks      No. of Students      Marks          No. of Students
0-4              10             14-18                 5
4-8              12             18-20                 8
8-12             18             20-25                 4
12-14             7             25 and over           6

[Ans. Median = 10.9, $Q_1$ = 6.5, $Q_3$ = 18.25, $D_4$ = 9.3, $P_{40}$ = 10]

5. A distribution $(x_i, f_i)$, $i=1, 2, \dots, n$ is transformed into the distribution $(y_i, f_i)$ by the relation $y_i=ax_i+b$, where a and b are constants. Show that the mean, median and mode of the new distribution are given in terms of the first distribution by the same transformation. [B.Sc. Agra 61, Aligarh 60]

6. Show that for a J-shaped distribution with the maximum frequency towards the lower values of the variable, the median is nearer to $Q_1$ (lower quartile) than to $Q_3$ (upper quartile).

[Agra M.Sc. 54]
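Exercises 1 and 2 above turn on the fact that, over equal distances, the average rate is the harmonic mean of the individual rates. The short Python sketch below checks this numerically; the helper name `harmonic_mean` is an illustrative assumption, not notation from the text.

```python
from math import isclose


def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    return len(values) / sum(1 / v for v in values)


# Exercise 1: three legs of 30 miles at 15, 20 and 25 m.p.h.
speeds = [15, 20, 25]
leg = 30
average_speed = len(speeds) * leg / sum(leg / s for s in speeds)  # distance / time

# Exercise 2: 60 miles out at 10 miles per gallon, 60 miles back at 15.
mileages = [10, 15]
distance = 60
overall_mileage = 2 * distance / sum(distance / m for m in mileages)  # miles / gallons

print(round(average_speed, 2), round(harmonic_mean(speeds), 2))
print(round(overall_mileage, 2), round(harmonic_mean(mileages), 2))
```

In both cases the rate computed from totals agrees with the harmonic mean, whereas the arithmetic mean of 10 and 15 would overstate the mileage as 12.5.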



3

Measures of Dispersion, Skewness and Kurtosis.
Moments of Frequency Distributions


3.1. Measures of Dispersion. It is not sufficient to know only the measures of central tendency of distributions; it is equally important to know how the variate values in the distributions are clustered round or scattered away from the points of central tendency. The spread or scatter of variates round the point of central tendency is known as dispersion. The answer to the question as to which of two cricket teams A and B, who have scored equal runs in two matches, is better lies in the consideration of the dispersion of runs round the average. That team whose spread of runs about the mean is less consists of consistently good players and is to be preferred.

3.2. Characteristics of an ideal measure of Dispersion.

The following are requisites for an ideal measure of dispersion :

1. It should be rigidly defined.

2. It should be readily understandable and easy to compute.

3. It should be based on all the observations.

4. It should be amenable to further mathematical treatment.

5. It should be least affected by fluctuations of sampling.

3.3. Measures of Dispersion. There exist several measures of dispersion to measure the degree of spread or scatter or variability in a distribution. They mainly fall into three groups :

(i) Measures of the distance between certain representative values, such as the range, the interquartile range or the interdecile range.

(ii) Measures compiled from the deviations of variate values from some central value, such as mean deviation about the mean, mean deviation about the median, and the standard deviation.




(iii) Measures compiled from the deviations of variate values among themselves, such as the mean difference.

3.4. Range and Interquartile differences.

Range is the difference between the greatest and the least of the variate values. It tells us little as to how the variate values are distributed between the two extreme values. Range is however a useful measure of dispersion in quality control.

Interquartile range is the difference between the third and the first quartiles and is given by $Q_3-Q_1$. It includes in its range fifty percent of the observations. Similarly the interdecile range is the difference between the ninth and the first deciles and includes in its range eighty percent of the observations. Both these measures are easily calculable and give some approximate idea of the "spread" of a distribution. These are generally used in elementary descriptive Statistics, but in advanced theory they find little place because they are difficult to handle mathematically in the theory of sampling.

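3.5. Mean deviations.

Mean deviation about any number a is defined by

$$\frac1N\sum_{i=1}^{n} f_i\,|x_i-a|,\qquad\text{where } N=\sum_{i=1}^{n} f_i.$$

Since $\sum_i f_i(x_i-\bar x)=0$, where $\bar x=\frac1N\sum_i f_ix_i$, signed deviations from the mean cancel out; the mean deviation about the mean is therefore defined by

$$\frac1N\sum_i f_i\,|x_i-\bar x|.$$

The mean deviation about the median $\mu_e$ is defined by

$$\frac1N\sum_i f_i\,|x_i-\mu_e|.$$

For the continuous frequency distribution the mean deviation is

$$\int_{-\infty}^{\infty}|x-\mu_1'|\,f(x)\,dx,$$

where $\mu_1'=\int_{-\infty}^{\infty} x f(x)\,dx$ and $\int_{-\infty}^{\infty} f(x)\,dx=1$.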

Si is defined to be a coefficient of dispersion. 

By the mean deviation we will usually understand mean devia¬ 
tion about the mean. 






Theorem 1. Mean deviation is least when measured from the median.

We show below that the sum of the absolute deviations from the median is a minimum.

I Proof. Suppose the values of the variate are arranged in ascending order as shown in the figure.

[Figure: the points $x_1, x_2, x_3, \dots, x_m, O, x_{m+1}, \dots, x_n$ marked in order on a line.]

Let the mean deviation about O, any point lying in the interval $(x_m, x_{m+1})$, be $\Delta$.

Let the origin O be displaced to the left through a distance c till it coincides with $x_m$. The effect of this shifting will be that the deviations of $x_1, x_2, \dots, x_m$ will all decrease by the amount c and the deviations of $x_{m+1}, x_{m+2}, \dots, x_n$ will all increase by the amount c. Hence the mean deviation about the new origin is given by

$$\Delta+\frac1n\,(n-2m)\,c.$$

The new deviation is obviously less than $\Delta$ so long as $(n-2m)$ is negative i.e. $m>n/2$.

Therefore if $n=2r$, the mean deviation is constant for all origins in the interval $(x_r, x_{r+1})$ and this value is the least; if $n=2r+1$, the mean deviation is lowest when the origin coincides with the value $x_{r+1}$. The mean deviation is therefore a minimum when deviations are measured from the median or, if the latter be not determinate, from an origin within the range in which it lies.

II Proof. The mean deviation of $x_1, x_2, \dots, x_n$ about any number x is given by

$$\frac1n\sum_i |x-x_i|.$$

Let $U=\sum_i |x-x_i|$.

Arrange $x_1, x_2, \dots, x_n$ so that $x_1<x_2<x_3<\dots<x_n$. If $x>x_n$, then by increasing x each term $|x-x_i|$ increases, and if $x<x_1$, then also each term $|x-x_i|$ can be increased by decreasing x. Therefore in either case U can be increased indefinitely and would not have a minimum.

Hence for a minimum $x_1\le x\le x_n$.

Suppose we have the minimum when x lies between $x_r$ and $x_{r+1}$. Then

$$U=\sum_i |x-x_i|=(x-x_1)+(x-x_2)+\dots+(x-x_r)+(x_{r+1}-x)+\dots+(x_n-x)$$

and

$$\frac{dU}{dx}=(1+1+\dots\text{ to } r \text{ terms})-(1+1+\dots\text{ to } n-r \text{ terms})=r-(n-r)=2r-n.$$

Therefore U decreases so long as $2r-n$ is negative i.e. $r<n/2$ and increases for bigger values of r.

By definition the value of x corresponding to $r=n/2$ is the median of the series.

The proof is complete.
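Theorem 1 can be illustrated numerically. The following Python sketch uses a small data set invented purely for illustration (it does not come from the text), scans a grid of candidate origins and confirms that the mean absolute deviation is smallest at the median.

```python
from statistics import mean, median


def mean_abs_dev(data, origin):
    """Mean of the absolute deviations of `data` measured from `origin`."""
    return sum(abs(x - origin) for x in data) / len(data)


# Hypothetical observations (odd in number, so the median is unique).
data = [3, 7, 8, 12, 20, 31, 45]

# Candidate origins from the least to the greatest value in steps of 0.5.
origins = [min(data) + 0.5 * k for k in range(2 * (max(data) - min(data)) + 1)]
best_origin = min(origins, key=lambda a: mean_abs_dev(data, a))

print(best_origin, median(data))
print(mean_abs_dev(data, median(data)), mean_abs_dev(data, mean(data)))
```

For an even number of observations every origin between the two middle values would give the same minimum, as noted in the first proof.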

Ex. 1. Compute the mean deviation (from the mean) of the following distribution.

Marks              0-10   10-20   20-30   30-40   40-50
No. of Students      5      8      15      16       6

In order to find the mean deviation from the mean, we shall first calculate the mean.

Mid value    $\xi=\frac{x-25}{10}$    Frequency    $f\xi$    $x-\bar x$    $f\,|x-\bar x|$
    x                                     f
    5              -2                     5         -10        -22             110
   15              -1                     8          -8        -12              96
   25               0                    15           0         -2              30
   35               1                    16          16          8             128
   45               2                     6          12         18             108
 Total                                   50          10                        472

$$\text{Mean}=a+h\,\frac{\sum f\xi}{\sum f}=25+\frac{10}{50}\times 10=27\text{ marks.}$$

Then

$$\text{mean deviation}=\frac1N\sum f\,|x_i-\bar x|=\frac{472}{50}=9.44\text{ marks.}$$
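The computation in Ex. 1 can be reproduced directly from the mid-values and frequencies. The Python lines below are a minimal sketch for checking the arithmetic; the variable names are illustrative.

```python
from math import isclose

# Mid-values of the classes 0-10, ..., 40-50 and their frequencies.
mid_values = [5, 15, 25, 35, 45]
freqs = [5, 8, 15, 16, 6]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, mid_values)) / n
mean_deviation = sum(f * abs(x - mean) for f, x in zip(freqs, mid_values)) / n

print(mean, round(mean_deviation, 2))  # prints: 27.0 9.44
```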


Ex. 2. Show that

$$\frac1N\sum_i f_i\,|x_i-\bar x|=\frac2N\Big[\bar x\sum_{x_i<\bar x} f_i-\sum_{x_i<\bar x} f_ix_i\Big].$$

(Lucknow 50, Gauhati 69)

Sol. Mean deviation about the mean $\bar x$

$$=\frac1N\sum_i f_i\,|x_i-\bar x|
=\frac1N\Big[\sum_{x_i<\bar x} f_i(\bar x-x_i)+\sum_{x_i>\bar x} f_i(x_i-\bar x)\Big]$$
$$=\frac1N\Big[-\sum_{x_i<\bar x} f_i(x_i-\bar x)+\sum_{x_i>\bar x} f_i(x_i-\bar x)\Big].$$

Now

$$\sum_i f_i(x_i-\bar x)=0\ \Rightarrow\ \sum_{x_i<\bar x} f_i(x_i-\bar x)+\sum_{x_i>\bar x} f_i(x_i-\bar x)=0,$$

i.e.

$$\sum_{x_i>\bar x} f_i(x_i-\bar x)=-\sum_{x_i<\bar x} f_i(x_i-\bar x).$$

Hence mean deviation about the mean

$$=\frac1N\Big[-\sum_{x_i<\bar x} f_i(x_i-\bar x)-\sum_{x_i<\bar x} f_i(x_i-\bar x)\Big]
=-\frac2N\sum_{x_i<\bar x} f_i(x_i-\bar x)
=\frac2N\Big[\bar x\sum_{x_i<\bar x} f_i-\sum_{x_i<\bar x} f_ix_i\Big].$$

EXERCISES

1. Find the mean deviation about the mean of the following set of observations. Also find the mean deviation about the median and compare.

2*, 36, 16, 58, 32, 4, 15.

2. Find the mean deviation from mean for the following data.

Class :        0-6   6-12   12-18   18-24   24-30
Frequency :     8     10     12       9       4

3. Compute the mean deviation from the following data. What light does it throw on the social conditions of the community ?

The difference in age between husband and wife in a particular community is given in the following table :

Difference in years    Frequency    Difference in years    Frequency
0-5                      44£        20-25                    109
5-10                     705        25-30                     52
10-15                    507        30-35                     16
15-20                    281        35-40                     14

[Bombay B.Com. 36]
[Ans. 5.3 years approx.]

4. Compute the mean deviation (from the mean) from the following table :

Wages in Rs.     No. of labourers
Above  0              685
  "   10              500
  "   20              423
  "   30              389
  "   40              209
  "   50               73
  "   60               50
  "   70                0

[Madras B.Sc. 62]
[Ans. Mean = 29; Mean deviation = 16.23]

5. Calculate the semi-interquartile range i.e. $\frac12(Q_3-Q_1)$ of the marks of 63 students of statistics given below :

Marks group     No. of students
0-10                  5
10-20                 7
20-30                10
30-40                16
40-50                11
50-60                 7
60-70                 3
70-80                 2
80-90                 2
90-100                0

6. Show that the mean deviation from the mean of the series $a, a+d, a+2d, \dots, a+2nd$ is

$$\frac{n(n+1)}{2n+1}\,d.$$

7. Show that mean deviation is least when measured from the median. [Agra 65, Delhi Hons. 58]





3.6. Variance. Standard Deviation.

The mean square deviation of the variate values about any number a is defined by

$$s^2=\frac1N\sum_i f_i\,(x_i-a)^2. \qquad ...(1)$$

This is also called the second moment about $x=a$ and is denoted by $\mu_2'$; the first moment is given by

$$\mu_1'=\frac1N\sum_i f_i\,(x_i-a).$$

If a is the arithmetic mean of the variate values i.e. $a=\bar x=\frac1N\sum_i f_ix_i$, then the mean square deviation about $\bar x$, defined by

$$\mu_2=\frac1N\sum_i f_i\,(x_i-\bar x)^2, \qquad ...(2)$$

is called the variance. The positive square root of the variance is called the standard deviation (s.d.) and is usually denoted by $\sigma$, so that we have

$$\sigma=+\sqrt{\mu_2}.$$

Thus we note that the variance is the mean of the squares of deviations of the variate values from the mean.

The device of squaring the deviations and then taking the square root of their arithmetic mean for obtaining the standard deviation looks artificial, but this technique makes the mathematics of sampling theory very much simpler than is the case with the mean deviation.

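Quite often $\mu_1'$ and $\mu_2$ above are denoted by

$$\mu_1'=E(x-a),\qquad \mu_2=E(x-\bar x)^2.$$

Now we show that $\mu_2$ is connected with $\mu_2'$ and $\mu_1'$ by the relation

$$\mu_2=\mu_2'-\mu_1'^{\,2}. \qquad ...(3)$$

We have

$$\mu_2'=\frac1N\sum_i f_i\,(x_i-a)^2=\frac1N\sum_i f_i\,[(x_i-\bar x)+(\bar x-a)]^2.$$

Let $x_i'=x_i-\bar x$, $d=\bar x-a$. Then

$$\mu_2'=\frac1N\sum_i f_i\,(x_i'+d)^2=\frac1N\sum_i f_i\,x_i'^{\,2}+\frac{2d}{N}\sum_i f_i\,x_i'+d^2\,\frac1N\sum_i f_i=\mu_2+d^2,$$

since $\sum_i f_i\,x_i'=\sum_i f_i\,(x_i-\bar x)=0$ and $\sum_i f_i=N$.

Also

$$d=\bar x-a=\frac1N\sum_i f_ix_i-a=\frac1N\sum_i f_i\,(x_i-a)=\mu_1'.$$

Thus $\mu_2'=\mu_2+\mu_1'^{\,2}$ or $\mu_2=\mu_2'-\mu_1'^{\,2}$.

In terms of $s^2$, $\sigma^2$ and $d^2$, this relation becomes

$$\sigma^2=s^2-d^2\qquad\text{or}\qquad s^2=\sigma^2+d^2.$$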

Clearly $s^2$ will be least when $d=0$ i.e. $\bar x-a=0$ or $a=\bar x$. Hence the root mean square deviation about any number is minimum when the number is the arithmetic mean. In other words we say that the standard deviation is the least possible root-mean-square deviation.

This result can be directly proved as follows :

Let $x_1, x_2, \dots, x_n$ be the observations; then the root-mean-square deviation about x is given by

$$\sqrt{\frac{\sum_i (x_i-x)^2}{n}}.$$

Let $u=\sum_i (x-x_i)^2=(x-x_1)^2+(x-x_2)^2+\dots+(x-x_n)^2$.

Then

$$\frac{du}{dx}=2\,[(x-x_1)+(x-x_2)+\dots+(x-x_n)]=2\Big[nx-\sum_i x_i\Big],\qquad \frac{d^2u}{dx^2}=2n>0.$$

Hence u is minimum when $\dfrac{du}{dx}=0$ i.e.

$$nx-\sum_i x_i=0\qquad\text{or}\qquad x=\frac1n\sum_i x_i=\bar x.$$

Minimum u gives minimum $s^2$.

The formula

$$\sigma^2=s^2-d^2$$

is of great importance, and will be constantly employed.

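Note. The formula (3) above can be put in the following form :

$$\mu_2=\sigma^2=\frac1N\sum_i f_i\,(x_i-\bar x)^2=\frac1N\sum_i f_i\,(x_i-a)^2-(\bar x-a)^2$$
$$=\frac1N\sum_i f_i\,(x_i-a)^2-\Big[\frac1N\sum_i f_i\,(x_i-a)\Big]^2
=\frac1N\sum_i f_i\,\xi_i^2-\Big[\frac1N\sum_i f_i\,\xi_i\Big]^2,\quad\text{where }\xi_i=x_i-a,$$
$$=\frac1N\sum_i f_i\,\xi_i^2-\bar\xi^{\,2}. \qquad ...(4)$$

Further, if $u_i=\dfrac{x_i-a}{h}=\dfrac{\xi_i}{h}$ so that $\xi_i=hu_i$, we have from (4)

$$\sigma_x^2=h^2\Big[\frac1N\sum_i f_i\,u_i^2-\Big(\frac1N\sum_i f_i\,u_i\Big)^2\Big]=h^2\sigma_u^2,\qquad \sigma_x=h\,\sigma_u,$$

where $\sigma_x^2$ and $\sigma_u^2$ are the variances of the variates x and u respectively. Also $u=\dfrac{x-a}{h}$ evidently gives $\bar x=a+h\bar u$.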
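The change of origin and scale in the Note above is the usual labour-saving device for grouped data. The Python sketch below compares it with the direct definition; the data are the marks distribution of Ex. 1 of §3.5 with the assumed origin a = 25 and scale h = 10, and the function names are illustrative only.

```python
from math import isclose


def variance_direct(xs, fs):
    """(1/N) * sum f (x - mean)^2, straight from the definition."""
    n = sum(fs)
    mean = sum(f * x for f, x in zip(fs, xs)) / n
    return sum(f * (x - mean) ** 2 for f, x in zip(fs, xs)) / n


def variance_step(xs, fs, a, h):
    """h^2 * [ (1/N) sum f u^2 - ((1/N) sum f u)^2 ] with u = (x - a) / h."""
    n = sum(fs)
    us = [(x - a) / h for x in xs]
    mean_u = sum(f * u for f, u in zip(fs, us)) / n
    mean_u2 = sum(f * u * u for f, u in zip(fs, us)) / n
    return h * h * (mean_u2 - mean_u ** 2)


mid_values = [5, 15, 25, 35, 45]
freqs = [5, 8, 15, 16, 6]

direct = variance_direct(mid_values, freqs)
shortcut = variance_step(mid_values, freqs, a=25, h=10)
print(round(direct, 4), round(shortcut, 4))
```

Both routes give a variance of 132 for this distribution; the shortcut only ever handles the small integers u = -2, ..., 2.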






3.7. Coefficient of Variation.

The unit of the standard deviation will be the same as that of the variate. It is thus difficult to compare dispersions in different populations unless the units happen to be identical; and so we feel the necessity of such measures as shall be independent of the variate scale, that is to say, shall be pure numbers. One such measure is Karl Pearson's Coefficient of variation, defined by $100\,\dfrac{\sigma}{\bar x}$ i.e. 100 × (s.d. ÷ mean).

Ex. 1. Find the mean and the variance for the distribution in which the values of x are the positive integers 1, 2, 3, ..., N, the frequency of each being unity.

Sol. Here

$$\mu_1'=\bar x=\frac1N\,(1+2+3+\dots+N)=\tfrac12\,(N+1).$$

The mean square deviation about $x=0$ is

$$\mu_2'=\frac1N\,(1^2+2^2+3^2+\dots+N^2)=\frac{(N+1)(2N+1)}{6}.$$

Hence

$$\mu_2=\mu_2'-\mu_1'^{\,2}=\frac{(N+1)(2N+1)}{6}-\big[\tfrac12(N+1)\big]^2=\tfrac1{12}\,(N^2-1).$$


Ex. 2. Find the mean deviation from the mean and standard deviation of the series $a, a+d, a+2d, \dots, a+2nd$, and prove that the latter is greater than the former.

[Punjab 57, Bangalore 62, Agra 63]

Sol. The number of terms is $2n+1$. Taking the arbitrary origin at the $(n+1)$th term i.e. $a+nd$, we get

x :     $a,\ a+d,\ a+2d,\ \dots,\ a+nd,\ a+(n+1)d,\ \dots,\ a+2nd$
$\xi$ :   $-nd,\ -(n-1)d,\ \dots,\ -2d,\ -d,\ 0,\ d,\ 2d,\ \dots,\ (n-1)d,\ nd$

where $\xi=x-(a+nd)$.

Obviously $\bar\xi=0$. Hence $\bar x=a+nd$ and $\xi=x-\bar x$.

Mean deviation about the mean

$$=\frac{1}{2n+1}\sum_i |x_i-\bar x|=\frac{2d}{2n+1}\,(1+2+\dots+n)=\frac{n(n+1)\,d}{2n+1}.$$

Variance

$$\sigma^2=\frac{1}{2n+1}\sum_i \xi_i^2=\frac{2d^2}{2n+1}\,(1^2+2^2+\dots+n^2)=\frac{2d^2}{2n+1}\cdot\frac{n(n+1)(2n+1)}{6}=\frac{n(n+1)\,d^2}{3},$$

so that S.D. $=d\sqrt{\dfrac{n(n+1)}{3}}$.

$$\text{S.D.}>\text{M.D.}\ \Rightarrow\ \sqrt{\frac{n(n+1)}{3}}>\frac{n(n+1)}{2n+1}\ \Rightarrow\ (2n+1)^2>3n(n+1)$$
$$\Rightarrow\ n^2+n+1>0\ \Rightarrow\ \big(n+\tfrac12\big)^2+\tfrac34>0,$$

which is true for every n.

Ex. 3. Scores of two golfers for 24 rounds were as follows :

Golfer A : 74, 75, 78, 78, 72, 77, 79, 78, 81, 76, 72, 72, 77, 74, 70, 78, 79, 80, 81, 74, 80, 75, 71, 73.

Golfer B : 86, 84, 80, 88, 89, 85, 86, 82, 82, 79, 86, 80, 82, 76, 86, 89, 87, 83, 80, 88, 86, 81, 84, 87.

Find which golfer may be considered to be the more consistent player. [Agra M.Sc. 1960]

Sol.

              Golfer A                                    Golfer B
  x    ξ = x - 75    f     fξ     fξ²        y    ξ' = y - 80    f'    f'ξ'    f'ξ'²
 70       -5         1     -5     25        76        -4         1     -4       16
 71       -4         1     -4     16        79        -1         1     -1        1
 72       -3         3     -9     27        80         0         3      0        0
 73       -2         1     -2      4        81         1         1      1        1
 74       -1         3     -3      3        82         2         3      6       12
 75        0         2      0      0        83         3         1      3        9
 76        1         1      1      1        84         4         2      8       32
 77        2         2      4      8        85         5         1      5       25
 78        3         4     12     36        86         6         5     30      180
 79        4         2      8     32        87         7         2     14       98
 80        5         2     10     50        88         8         2     16      128
 81        6         2     12     72        89         9         2     18      162
Total               24     24    274       Total                24     96      664


Golfer 'A'

$$\bar\xi=\frac1N\sum f\xi=\frac1{24}\,(24)=1.$$

Hence $\xi=x-a$ gives $\bar x=\bar\xi+a=1+75=76$.

$$\sigma_x^2=\frac1N\sum f\xi^2-\bar\xi^{\,2}=\frac{274}{24}-1=10.4167\text{ approx.}\ \Rightarrow\ \sigma_x=3.23.$$

Coefficient of variation $=100\,(\sigma_x/\bar x)=\dfrac{323}{76}=4.25$.

Golfer 'B'

$$\bar\xi'=\frac1N\sum f'\xi'=\frac1{24}\,(96)=4.$$

Hence $\xi'=y-b$ gives $\bar y=4+80=84$.

$$\sigma_y^2=\frac1N\sum f'\xi'^{\,2}-\bar\xi'^{\,2}=\frac1{24}\,(664)-16=11.6667\ \Rightarrow\ \sigma_y=3.416.$$

Coefficient of variation $=100\,(\sigma_y/\bar y)=4.07$.

The coefficient of variation 4.07 for Golfer 'B' is less than the coefficient of variation 4.25 for Golfer 'A'. Hence Golfer 'B' is a more consistent player.
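The comparison in Ex. 3 can be checked from the raw scores with Python's standard `statistics` module. This is a sketch for verification only; the fifth score of Golfer B is taken as 89, an assumption made to agree with the frequency table of the solution (which records the score 89 twice).

```python
from statistics import mean, pstdev

golfer_a = [74, 75, 78, 78, 72, 77, 79, 78, 81, 76, 72, 72,
            77, 74, 70, 78, 79, 80, 81, 74, 80, 75, 71, 73]
# Fifth score assumed to be 89, consistent with the frequency table.
golfer_b = [86, 84, 80, 88, 89, 85, 86, 82, 82, 79, 86, 80,
            82, 76, 86, 89, 87, 83, 80, 88, 86, 81, 84, 87]


def coefficient_of_variation(scores):
    """100 * (population s.d. / mean), as defined in the text."""
    return 100 * pstdev(scores) / mean(scores)


cv_a = coefficient_of_variation(golfer_a)
cv_b = coefficient_of_variation(golfer_b)
print(mean(golfer_a), round(cv_a, 2))  # prints: 76 4.25
print(mean(golfer_b), round(cv_b, 2))  # prints: 84 4.07
```

`pstdev` divides by N, matching the definition of the standard deviation used in this chapter; `stdev`, which divides by N - 1, would give slightly larger figures.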

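Ex. 4. Show that, if the variable takes the values 0, 1, 2, ..., n with frequencies proportional to the binomial coefficients $\binom n0, \binom n1, \binom n2, \dots, \binom nn$ respectively, then the mean of the distribution is $\dfrac n2$, the mean square deviation about $x=0$ is $\dfrac{n(n+1)}{4}$ and the variance is $\dfrac n4$.

[Agra 1972, Kanpur 1968]

Solution : Here

$$N=\sum_i f_i=\binom n0+\binom n1+\binom n2+\dots+\binom nn=(1+1)^n=2^n,$$

$$\sum_i f_ix_i=\sum_{i=1}^{n} i\binom ni=n\sum_{i=1}^{n}\binom{n-1}{i-1}=n\,(1+1)^{n-1}=n\cdot 2^{n-1}.$$

$$\text{Mean }\bar x=\frac1N\sum_i f_ix_i=\frac{n\cdot 2^{n-1}}{2^n}=\frac n2.$$

$\therefore$ the first moment (about $x=0$) is $\mu_1'=\dfrac n2$. Now

$$\mu_2'=\frac1N\sum_{i=1}^{n} i^2\binom ni=\frac{n}{2^n}\sum_{i=1}^{n} i\binom{n-1}{i-1}
=\frac{n}{2^n}\sum_{i=1}^{n}\,[(i-1)+1]\binom{n-1}{i-1}$$
$$=\frac{n}{2^n}\Big[(n-1)\sum_{i=2}^{n}\binom{n-2}{i-2}+\sum_{i=1}^{n}\binom{n-1}{i-1}\Big]
=\frac{n}{2^n}\Big[(n-1)(1+1)^{n-2}+(1+1)^{n-1}\Big]$$
$$=\frac{n}{2^n}\Big[(n-1)\,2^{n-2}+2^{n-1}\Big]=\frac{n(n+1)}{4}.$$

$$\sigma^2=\mu_2'-\mu_1'^{\,2}=\frac{n(n+1)}{4}-\frac{n^2}{4}=\frac n4.$$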

Ex. 5. In a series of measurements we obtain $m_1$ values of magnitude $x_1$, $m_2$ values of magnitude $x_2$, and so on. If $\bar x$ is the mean value of all the measurements, prove that the standard deviation is

$$\Big\{\frac{\sum_r m_r\,(k-x_r)^2}{\sum_r m_r}-\delta^2\Big\}^{1/2},$$

where $\bar x=k+\delta$ and k is any constant. [Agra 1964]

Solution : Here

$$\sigma^2=\frac{\sum_r m_r\,(x_r-\bar x)^2}{\sum_r m_r},\qquad\text{where } \bar x=\frac{\sum_r m_rx_r}{\sum_r m_r},$$
$$=\frac{\sum_r m_r\,\{(x_r-k)-(\bar x-k)\}^2}{\sum_r m_r}
=\frac{\sum_r m_r\,(x_r-k)^2-(\bar x-k)^2\sum_r m_r}{\sum_r m_r}$$
$$=\frac{\sum_r m_r\,(x_r-k)^2}{\sum_r m_r}-\delta^2,$$

or

$$\sigma=\Big[\frac{\sum_r m_r\,(k-x_r)^2}{\sum_r m_r}-\delta^2\Big]^{1/2}.$$

Ex. 6. From a sample of observations the arithmetic mean 
sind variance are calculated. It is then found that one of the values 
k u is in error and should be replaced by xf. Show that the adjust¬ 
ment to the variance to correct this error is 


4 (*/-*,) (Yi’ + A, — ^'-~- T ‘ +2r) 


where T is the total of the original results. 

[Agra 1968J 

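Sol. : Let $\bar x$ be the mean and $\sigma^2$ the variance of the observations $x_1, x_2, \ldots, x_n$.

$$\therefore\quad \sigma^2=\frac{1}{n}\sum_i (x_i-\bar x)^2=\frac{1}{n}\sum_i x_i^2-\bar x^2=\frac{1}{n}\sum_i x_i^2-\frac{T^2}{n^2} \qquad ...(1)$$

where $T=x_1+x_2+\ldots+x_n$.

When $x_1$ is replaced by $x_1'$, let the corrected variance be $\sigma_1^2$; then

$$\sigma_1^2=\frac{1}{n}\left(x_1'^2+x_2^2+\ldots+x_n^2\right)-\left(\frac{x_1'+x_2+\ldots+x_n}{n}\right)^2$$

$$=\sigma^2+\frac{1}{n}\left(x_1'^2-x_1^2\right)+\frac{T^2}{n^2}-\left(\frac{T-x_1+x_1'}{n}\right)^2$$

Hence

$$\sigma_1^2-\sigma^2=\frac{1}{n}(x_1'^2-x_1^2)+\frac{1}{n^2}\left[T^2-(T-x_1+x_1')^2\right]$$

$$=\frac{1}{n}(x_1'-x_1)(x_1'+x_1)-\frac{1}{n^2}(x_1'-x_1)(2T-x_1+x_1')$$

$$=\frac{1}{n}(x_1'-x_1)\left[x_1'+x_1-\frac{x_1'-x_1+2T}{n}\right]$$

The expression on the right side has to be added to $\sigma^2$ to obtain the correct variance.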
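The adjustment formula of Ex. 6 can be compared with a full recomputation of the variance. The sketch below assumes the variance is taken with divisor $n$, as in the text; the function names and the sample figures are hypothetical.

```python
def variance(xs):
    """Variance with divisor n, as used in the text."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def variance_adjustment(n, total, x_old, x_new):
    """Amount to add to the old variance when x_old is replaced by x_new.
    `total` is T, the total of the original observations."""
    diff = x_new - x_old
    return diff * (x_new + x_old - (diff + 2 * total) / n) / n

# Hypothetical data: the first value 4.0 should have been 6.0.
original = [4.0, 7.0, 9.0, 12.0]
corrected = [6.0, 7.0, 9.0, 12.0]
adjusted = variance(original) + variance_adjustment(len(original), sum(original), 4.0, 6.0)
```

Here `adjusted` and `variance(corrected)` agree, so only $n$, $T$ and the two values of the faulty observation are needed to correct the variance.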

Theorem 2. For any discrete distribution the standard deviation is not less than the mean deviation from the mean. [Agra 67, I.A.S. 68]

Let the variate $x_i$ occur with frequency $f_i$, where $i=1, 2, 3, \ldots, n$ and $N=\sum_i f_i$.

Recall that

$$\sigma^2=\frac{1}{N}\sum_i f_i (x_i-\bar x)^2$$

and the mean deviation from the mean is given by

$$\frac{1}{N}\sum_i f_i\,|x_i-\bar x|.$$

We have to establish

$$\frac{1}{N}\sum_i f_i (x_i-\bar x)^2\geq\left[\frac{1}{N}\sum_i f_i\,|x_i-\bar x|\right]^2.$$

If we put $|x_i-\bar x|=z_i$, then we have to prove that

$$\frac{1}{N}\sum_i f_i z_i^2\geq\left(\frac{1}{N}\sum_i f_i z_i\right)^2$$

$$\text{or}\quad \frac{1}{N}\sum_i f_i z_i^2-\bar z^2\geq 0$$

$$\text{or}\quad \frac{1}{N}\sum_i f_i (z_i-\bar z)^2\geq 0.$$

The last result is always true, the left side being a sum of squares. Thus the statement stands proved.
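Theorem 2 can be illustrated numerically. The sketch below (function name and data are illustrative assumptions) returns both measures for a frequency distribution; equality occurs when every $|x_i-\bar x|$ is the same, as in the two-point case.

```python
from math import sqrt

def sd_and_mean_deviation(values, freqs):
    """Return (standard deviation, mean deviation from the mean)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    sd = sqrt(sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n)
    md = sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n
    return sd, md

# A skew distribution: the s.d. exceeds the mean deviation.
sd_skew, md_skew = sd_and_mean_deviation([1, 2, 3, 4, 10], [3, 5, 4, 2, 1])
# Two equally frequent values: all |x - mean| are equal, so s.d. = mean deviation.
sd_two, md_two = sd_and_mean_deviation([0, 2], [1, 1])
```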

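3.8. Mean and variance of Composite sets.

Suppose we have $p$ families with heights of individuals as

$$\begin{array}{l}
x_{11},\ x_{12},\ \ldots,\ x_{1n_1}\\
x_{21},\ x_{22},\ \ldots,\ x_{2n_2}\\
\ldots\quad\ldots\quad\ldots\\
x_{i1},\ x_{i2},\ \ldots,\ x_{in_i}\\
\ldots\quad\ldots\quad\ldots\\
x_{p1},\ x_{p2},\ \ldots,\ x_{pn_p}
\end{array}$$

We wish to find the grand mean and s.d. of the whole set. The mean $\bar x_i$ of the $i$th group is

$$\bar x_i=\frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij} \qquad ...(1)$$

and the s.d. $\sigma_i$ is given by

$$\sigma_i^2=\frac{1}{n_i}\sum_{j=1}^{n_i}(x_{ij}-\bar x_i)^2.$$

The grand mean

$$\bar x=\frac{1}{N}\sum_{i=1}^{p}\sum_{j=1}^{n_i} x_{ij}, \quad\text{where } N=\sum_{i=1}^{p} n_i \qquad ...(2)$$

$$\text{or}\quad \bar x=\frac{1}{N}\sum_{i=1}^{p} n_i \bar x_i \quad\text{from (1)} \qquad ...(3)$$

The variance $\sigma^2$ of the whole set is given by

$$\sigma^2=\frac{1}{N}\sum_{i}\sum_{j}(x_{ij}-\bar x)^2 \qquad ...(4)$$

The root-mean-square deviation $s_i$ about $\bar x$ in the $i$th group is given by

$$s_i^2=\frac{1}{n_i}\sum_{j=1}^{n_i}(x_{ij}-\bar x)^2=\frac{1}{n_i}\sum_{j=1}^{n_i}\left[(x_{ij}-\bar x_i)+(\bar x_i-\bar x)\right]^2$$

$$=\frac{1}{n_i}\sum_{j=1}^{n_i}(x_{ij}-\bar x_i)^2+(\bar x_i-\bar x)^2, \quad\text{since } \sum_{j=1}^{n_i}(x_{ij}-\bar x_i)=0$$

$$=\sigma_i^2+d_i^2, \quad\text{where } d_i=\bar x_i-\bar x \qquad ...(5)$$

Now using (5) the result (4) can be put in the form

$$\sigma^2=\frac{1}{N}\sum_{i=1}^{p} n_i s_i^2=\frac{1}{N}\sum_{i=1}^{p} n_i(\sigma_i^2+d_i^2) \qquad ...(6)$$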

Ex. 1. The standard deviations of two sets containing $n_1$ and $n_2$ members are $\sigma_1$ and $\sigma_2$ respectively, being measured from their respective means $m_1$ and $m_2$. If the two sets are grouped together as one set of $(n_1+n_2)$ members, show that the standard deviation, $\sigma$, of this set, measured from its mean, is given by

$$\sigma^2=\frac{n_1\sigma_1^2+n_2\sigma_2^2}{n_1+n_2}+\frac{n_1 n_2}{(n_1+n_2)^2}(m_1-m_2)^2$$

[Agra 68, Poona 67]

Here $\bar x_1=m_1$, $\bar x_2=m_2$, $i=1, 2$.

Hence the grand mean from (3) above is

$$\bar x=\frac{n_1 m_1+n_2 m_2}{n_1+n_2} \qquad ...(1)$$

The variance $\sigma^2$ of the combined set is given, from (6) above, by

$$\sigma^2=\frac{1}{N}\sum_{i=1}^{2} n_i(\sigma_i^2+d_i^2) \qquad ...(2)$$

where $N=n_1+n_2$, $d_1=m_1-\bar x$, $d_2=m_2-\bar x$.

We can put (2) as

$$\sigma^2=\frac{n_1\sigma_1^2+n_2\sigma_2^2}{n_1+n_2}+\frac{n_1 d_1^2+n_2 d_2^2}{n_1+n_2}$$

From (1),

$$d_1=m_1-\bar x=\frac{n_2(m_1-m_2)}{n_1+n_2}, \qquad d_2=m_2-\bar x=\frac{n_1(m_2-m_1)}{n_1+n_2}$$

Hence

$$\sigma^2=\frac{n_1\sigma_1^2+n_2\sigma_2^2}{n_1+n_2}+\frac{n_1 n_2}{(n_1+n_2)^2}(m_1-m_2)^2.$$


Ex. 2. A distribution consists of three components with frequencies 200, 250 and 300 having means of 25, 10 and 15 and standard deviations of 3, 4 and 5 respectively. Show that the mean of the combined distribution is 16 and the standard deviation is 7.2 approximately. Find also the coefficient of variation (C.V.). [Agra 64]

Here $n_1=200$, $n_2=250$, $n_3=300$, $N=n_1+n_2+n_3=750$

$\bar x_1=25$, $\bar x_2=10$, $\bar x_3=15$

$\sigma_1=3$, $\sigma_2=4$, $\sigma_3=5$.

The formula

$$\bar x=\frac{1}{N}\sum_{i=1}^{3} n_i \bar x_i$$

gives

$$\bar x=\frac{1}{750}(200\times 25+250\times 10+300\times 15)=\frac{12000}{750}=16$$

Further $d_1=\bar x_1-\bar x=25-16=9$.

Similarly $d_2=-6$, $d_3=-1$.

The formula

$$\sigma^2=\frac{1}{N}\sum_{i=1}^{3} n_i(\sigma_i^2+d_i^2)$$

gives

$$\sigma^2=\frac{1}{750}\left[200\,(9+81)+250\,(16+36)+300\,(25+1)\right]=\frac{776}{15}=51.7333$$

i.e. $\sigma=7.2$ approximately.

$$\text{C.V.}=100\,\frac{\sigma}{\bar x}=100\times\frac{7.2}{16}=45.$$
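Formulae (3) and (6) of section 3.8 translate directly into a short routine. The following is a sketch only; `combine_groups` is an illustrative name, and the figures are those of the three-component example above.

```python
from math import sqrt

def combine_groups(sizes, means, sds):
    """Grand mean and s.d. of a composite set from the group sizes,
    group means and group s.d.s (each s.d. taken about its own mean)."""
    n_total = sum(sizes)
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    variance = sum(n * (s ** 2 + (m - grand_mean) ** 2)
                   for n, m, s in zip(sizes, means, sds)) / n_total
    return grand_mean, sqrt(variance)

# Three components with frequencies 200, 250, 300.
mean, sd = combine_groups([200, 250, 300], [25, 10, 15], [3, 4, 5])
cv = 100 * sd / mean  # coefficient of variation, about 45
```

The same routine handles any number of groups, since only $n_i$, $\bar x_i$ and $\sigma_i$ enter the formulae.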






Ex. 2. A frequency distribution is divided into two parts. The mean and standard deviation of the first part are $m_1$ and $s_1$ and those of the second part are $m_2$ and $s_2$. Obtain the mean and standard deviation for the whole distribution. [Kuruk. 67, Lucknow 66]

Ex. 3. An analysis of monthly wages paid to 586 and 648 workers in two firms A and B, belonging to the same industry, reveals average monthly wages of Rs. 52.5 and Rs. 47.5, with variances of the wage distributions 100 and 121 respectively.

(a) Which firm, A or B, pays out the larger amount as monthly wages ?

(b) In which firm, A or B, is there greater variability in individual wages ?

(c) What are the measures of (i) average monthly wage and (ii) the variability in individual wages of all the workers in the two firms, A and B, taken together ? [Punjab 61]

The given data can be expressed as :

Firm A : $n_1=586$, $\bar x_1=52.5$, $\sigma_1^2=100$

Firm B : $n_2=648$, $\bar x_2=47.5$, $\sigma_2^2=121$

(a) Monthly wages paid by firm A : $n_1\bar x_1=586\times 52.5=$ Rs. 30765

Monthly wages paid by firm B : $n_2\bar x_2=648\times 47.5=$ Rs. 30780

Thus firm B pays out the larger amount per month.

(b) C.V. in firm A : $100\,\dfrac{\sigma_1}{\bar x_1}=100\times 10/52.5=19.05$

C.V. in firm B : $100\,\dfrac{\sigma_2}{\bar x_2}=100\times 11/47.5=23.16$

Firm B has the greater variability in individual wages.

(c) When the two firms are taken together

(i) average monthly wage

$$\bar x=\frac{n_1\bar x_1+n_2\bar x_2}{n_1+n_2}=\frac{30765+30780}{586+648}=\frac{61545}{1234}=49.87 \text{ rupees.}$$

(ii) variability in individual wages

$$\sigma^2=\frac{1}{N}\sum_{i=1}^{2} n_i(\sigma_i^2+d_i^2), \quad\text{where } d_1=\bar x_1-\bar x,\ d_2=\bar x_2-\bar x$$

$$=\frac{n_1\sigma_1^2+n_2\sigma_2^2}{n_1+n_2}+\frac{n_1 n_2(\bar x_1-\bar x_2)^2}{(n_1+n_2)^2}$$

$$=\frac{586\times 100+648\times 121}{586+648}+\frac{586\times 648\,(52.5-47.5)^2}{(586+648)^2}$$

$$=117.26$$

$\Rightarrow \sigma=10.83$ rupees approx.

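Ex. 4. Show that, if deviations $a_r\ (=x_r-M)$ are small compared with the mean $M$, so that $(a/M)^3$ and higher powers of $a/M$ may be neglected,

(a) (i) $G=M\left(1-\dfrac{\sigma^2}{2M^2}\right)$ (ii) $M^2-G^2=\sigma^2$ [Agra 53, 61]

(iii) $H=M\left(1-\dfrac{\sigma^2}{M^2}\right)$ (iv) $2G-M-H=0$ [Delhi 71]

(b) Mean $(\sqrt{x})=\sqrt{M}\left(1-\dfrac{\sigma^2}{8M^2}\right)$ approx.,

where $\sigma$ is the s.d. of $x$. [Delhi Hons. 66]

Let $x_1, x_2, \ldots, x_n$ be the observations. We are given $E(x)=M$, $E(x-M)^2=\sigma^2$. Let $x_r-M=a_r$ be the deviation.

$$G=(x_1 x_2\ldots x_n)^{1/n}=\left[(M+a_1)(M+a_2)\ldots(M+a_n)\right]^{1/n}$$

$$=M\left[\prod_r\left(1+\frac{a_r}{M}\right)\right]^{1/n}=M\left[1+\frac{1}{M}\sum_r a_r+\frac{1}{M^2}\sum_{r<s} a_r a_s+\ldots\right]^{1/n}$$

Now

$$M=\frac{1}{n}\sum_r x_r=\frac{1}{n}\sum_r(a_r+M)=M+\frac{1}{n}\sum_r a_r \quad\therefore\quad \sum_r a_r=0$$

and hence $\sum_r a_r^2+2\sum_{r<s} a_r a_s=0$, i.e.

$$\sum_{r<s} a_r a_s=-\frac{1}{2}\sum_r a_r^2=-\frac{n\sigma^2}{2},$$

since $\sigma^2=\dfrac{1}{n}\sum_r(x_r-M)^2=\dfrac{1}{n}\sum_r a_r^2$, i.e. $\sum_r a_r^2=n\sigma^2$.

Hence

$$G=M\left[1-\frac{n\sigma^2}{2M^2}\right]^{1/n}=M\left(1-\frac{\sigma^2}{2M^2}\right), \quad\text{proving (i)}$$

$$\therefore\quad G^2=M^2\left[1-\frac{\sigma^2}{2M^2}\right]^2=M^2\left(1-\frac{\sigma^2}{M^2}\right)=M^2-\sigma^2 \ \Rightarrow\ M^2-G^2=\sigma^2, \quad\text{proving (ii)}$$

(iii)

$$\frac{1}{H}=\frac{1}{n}\sum_r\frac{1}{x_r}=\frac{1}{n}\sum_r(a_r+M)^{-1}=\frac{1}{nM}\sum_r\left[1-\frac{a_r}{M}+\frac{a_r^2}{M^2}\right]=\frac{1}{M}\left(1+\frac{\sigma^2}{M^2}\right)$$

$$\therefore\quad H=M\left(1+\frac{\sigma^2}{M^2}\right)^{-1}=M\left(1-\frac{\sigma^2}{M^2}\right), \quad\text{proving (iii)}$$

Also

$$2G=2M\left(1-\frac{\sigma^2}{2M^2}\right)=M\left(1-\frac{\sigma^2}{M^2}\right)+M=H+M \ \Rightarrow\ 2G-H-M=0, \quad\text{proving (iv)}$$

(b)

$$\text{Mean }(\sqrt{x})=\frac{1}{n}\sum_r\sqrt{x_r}=\frac{1}{n}\sum_r(a_r+M)^{1/2}=\frac{\sqrt{M}}{n}\sum_r\left(1+\frac{a_r}{M}\right)^{1/2}$$

$$=\frac{\sqrt{M}}{n}\sum_r\left[1+\frac{a_r}{2M}-\frac{a_r^2}{8M^2}\right]=\sqrt{M}\left[1-\frac{\sigma^2}{8M^2}\right].$$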
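The approximations of Ex. 4 can be tried on a small data set whose deviations are small compared with the mean. This is only a sketch; the function name `mean_summaries` and the observations are hypothetical, chosen so that $M=100$ and $\sigma^2=1.1$.

```python
from math import exp, log, sqrt

def mean_summaries(xs):
    """Return (M, variance, G, H, mean of sqrt x) for positive observations."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / n
    g = exp(sum(log(x) for x in xs) / n)   # geometric mean
    h = n / sum(1 / x for x in xs)         # harmonic mean
    root_mean = sum(sqrt(x) for x in xs) / n
    return m, var, g, h, root_mean

xs = [99.0, 100.5, 101.0, 98.5, 101.0]
m, var, g, h, root_mean = mean_summaries(xs)
g_approx = m * (1 - var / (2 * m * m))           # result (i)
h_approx = m * (1 - var / (m * m))               # result (iii)
root_approx = sqrt(m) * (1 - var / (8 * m * m))  # result (b)
```

The exact and approximate values differ only by terms of order $(a/M)^3$, which is what the neglect of higher powers allows.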

3.9. Standard Variate. If $X$ is a stochastic variable such that $E(X)=M$ and $V(X)$, i.e. $E(X-M)^2=\sigma^2$, then the variate

$$u=\frac{X-M}{\sigma}$$

is called the standard variate.

We show that $E(u)=0$ and $V(u)=1$. [$E$ = Expectation, $V$ = Variance]

Now

$$\bar u=E(u)=\frac{1}{n}\sum_{i=1}^{n}\frac{x_i-M}{\sigma}=\frac{1}{n\sigma}\left[\sum_i x_i-nM\right]=\frac{1}{n\sigma}\left[n\bar x-nM\right]=0, \quad\text{since } \bar x=M.$$

$$V(u)=E(u-\bar u)^2=\frac{1}{n}\sum_i\left(\frac{x_i-M}{\sigma}\right)^2=\frac{1}{n\sigma^2}\sum_i(x_i-M)^2=\frac{\sigma^2}{\sigma^2}=1.$$



Exercises 

1. Explain the term dispersion. Discuss the relative merits and demerits of various measures of dispersion. [Delhi B.Sc. 67]

2. What is the difference between mean deviation and standard deviation ? Show that the standard deviation is independent of change of origin but not of scale.

3. Prove that the mean deviation about the mean $\bar x$ of the variate $x$, the frequency of whose $i$th size $x_i$ is $f_i$, is given by

$$\frac{2}{N}\left[\bar x\sum_{x_i<\bar x} f_i-\sum_{x_i<\bar x} f_i x_i\right]$$

[Luck. 50, Gauhati 69]

4. Calculate the mean and standard deviation of the following distribution :

x : 2.5 - 7.5     7.5 - 12.5     12.5 - 17.5     17.5 - 22.5
f : 12            28             65              121

x : 22.5 - 27.5   27.5 - 32.5    32.5 - 37.5     37.5 - 42.5
f : 175           198            176             120

x : 42.5 - 47.5   47.5 - 52.5    52.5 - 57.5     57.5 - 62.5
f : 66            27             9               3

(Banaras 69)
[Ans. mean = 30.005, S.D. = 10.01]

5. Explain clearly the ideas implied in using a working origin and scale for the calculation of the arithmetic mean and standard deviation of a frequency distribution. The values of the arithmetic mean and standard deviation of the following frequency distribution of a continuous variable, derived from the analysis in the above manner, are 40.604 lb and 7.92 lb respectively.

x : -3   -2   -1   0    1    2    3    4
f : 3    15   45   57   50   36   25   9

Determine the actual class intervals. [I.S.I. (Dip.)]

6. For a frequency distribution of marks in Statistics of 200 candidates (grouped in intervals 0-5, 5-10, ..., etc.), the mean and standard deviation were found to be 40 and 15 respectively. Later it was discovered that the score 43 was misread as 53 in obtaining the frequency distribution. Find the corrected mean and standard deviation corresponding to the corrected frequency distribution. [Agra 66, Delhi 67, I.A.S. 57]

[Ans. Mean = 39.95, S.D. = 14.974]

7. An analysis of monthly wages paid to the workers in two firms A and B belonging to the same industry gave the following results :

                                         Firm A      Firm B
No. of wage earners                      586         648
Average monthly wages                    Rs. 52.5    Rs. 47.5
Variance of the distribution of wages    100         121

(a) Which firm, A or B, pays out the larger amount as monthly wages ?

(b) In which firm, A or B, is there greater variability in individual wages ?

(c) What are the measures of (1) average monthly wages, and (2) the variability in individual wages, of all the workers in the two firms A and B taken together ?

(Delhi 69, I.A.S. 51, Punjab 61)

[Ans. (a) Firm B pays a larger amount as monthly wages.

(b) There is greater variability in individual wages in firm B.

(c) (i) the combined arithmetic mean = Rs. 49.87.

(ii) the combined S.D. = Rs. 10.83.]

3.10. Properties of the variance of a Random variable.

Let $X$ be a random variable. The variance of $X$, denoted by $V(X)$ or $\sigma_x^2$, is given by

$$V(X)=E[X-E(X)]^2$$

The evaluation of $V(X)$ may be simplified with the aid of the following result

$$V(X)=E(X^2)-[E(X)]^2$$

which corresponds to the formula $\mu_2=\mu_2'-\mu_1'^2$.

Proof.

$$V(X)=E[X-E(X)]^2$$

$$=E\{X^2-2XE(X)+[E(X)]^2\}$$

$$=E(X^2)-2E(X)E(X)+[E(X)]^2 \qquad [\text{Recall that } E(X) \text{ is constant}]$$

$$=E(X^2)-[E(X)]^2.$$

Ex. 1. Show that the mean (about the origin) of the discontinuous distribution whose frequencies at $0, 1, 2, \ldots, r, \ldots$ are

$$e^{-m},\quad \frac{m}{1!}e^{-m},\quad \frac{m^2}{2!}e^{-m},\quad \ldots,\quad \frac{m^r}{r!}e^{-m},\quad \ldots$$

is $m$, and that the variance is also $m$.

Sol. Here $X$ takes the values $0, 1, 2, \ldots, r, \ldots$

Then

$$E(X)=\sum_{r=0}^{\infty} r\,e^{-m}\frac{m^r}{r!}=e^{-m}\,m\sum_{r=1}^{\infty}\frac{m^{r-1}}{(r-1)!}=e^{-m}\,m\,e^{m}=m.$$

$$E(X^2)=\sum_{r=0}^{\infty} r^2 e^{-m}\frac{m^r}{r!}=\sum_{r=0}^{\infty}\{r(r-1)+r\}\,e^{-m}\frac{m^r}{r!}$$

$$=e^{-m}m^2\sum_{r=2}^{\infty}\frac{m^{r-2}}{(r-2)!}+e^{-m}m\sum_{r=1}^{\infty}\frac{m^{r-1}}{(r-1)!}$$

$$=e^{-m}m^2e^{m}+e^{-m}m\,e^{m}=m^2+m$$

Hence

$$V(X)=E(X^2)-[E(X)]^2=m^2+m-m^2=m.$$
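The result $V(X)=E(X^2)-[E(X)]^2$ can be applied numerically to the distribution of Ex. 1 by summing a finite number of terms of the two series. This is a sketch under the assumption that 100 terms are ample for moderate $m$; the function name is illustrative.

```python
from math import exp

def poisson_mean_variance(m, terms=100):
    """Mean and variance of the distribution with P(X = r) = e^{-m} m^r / r!,
    obtained by summing the first `terms` terms of the series."""
    p = exp(-m)  # P(X = 0)
    mean = second = 0.0
    for r in range(1, terms):
        p *= m / r           # P(X = r) from P(X = r - 1)
        mean += r * p        # contributes to E(X)
        second += r * r * p  # contributes to E(X^2)
    return mean, second - mean ** 2

mean, var = poisson_mean_variance(2.5)
```

Both values come out equal to the parameter $m=2.5$ to within rounding.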

Ex. 2. Find the variance of the distribution

$$dF=\frac{1}{B(m,n)}(1-x)^{m-1}x^{n-1}\,dx, \quad 0\leq x\leq 1.$$

Sol.

$$E(X)=\frac{1}{B(m,n)}\int_0^1 x^{n}(1-x)^{m-1}\,dx=\frac{B(n+1,m)}{B(m,n)}=\frac{n}{m+n}$$

$$E(X^2)=\frac{1}{B(m,n)}\int_0^1 x^{n+1}(1-x)^{m-1}\,dx=\frac{B(n+2,m)}{B(m,n)}=\frac{n(n+1)}{(m+n)(m+n+1)}$$

$$V(X)=\frac{n(n+1)}{(m+n)(m+n+1)}-\frac{n^2}{(m+n)^2}=\frac{mn}{(m+n)^2(m+n+1)}$$

Property I. If $c$ is a constant,

$$V(X+c)=V(X).$$

Proof. $V(X+c)=E[(X+c)-E(X+c)]^2=E[(X+c)-E(X)-c]^2=E[X-E(X)]^2=V(X)$

Property II. If $c$ is a constant,

$$V(cX)=c^2V(X)$$

Proof.

$$V(cX)=E(cX)^2-\{E(cX)\}^2$$

$$=c^2E(X^2)-c^2[E(X)]^2$$

$$=c^2[E(X^2)-\{E(X)\}^2]$$

$$=c^2V(X).$$

Property III. Consider the linear transformation applied to the outcome $X$ :

$$Y=aX+b$$

Then $V(Y)=a^2V(X)$.

Proof. $E(Y)=E(aX+b)=aE(X)+b$

Thus the new mean is obtained from the old mean by applying the same linear transformation.

$$V(Y)=V(aX+b)$$

$$=E[(aX+b)-E(aX+b)]^2$$

$$=E[(aX+b)-aE(X)-b]^2$$

$$=E[aX-aE(X)]^2$$

$$=a^2E[X-E(X)]^2$$

$$=a^2V(X)$$

Since $V(aX+b)\neq aV(X)+b$, the variance does not possess the linear property that the expectation has, viz.

$$E(aX+b)=aE(X)+b$$

Property IV. If $(X, Y)$ is a two-dimensional random variable, and if $X$ and $Y$ are independent, then

$$V(X+Y)=V(X)+V(Y)$$

Proof. Let $Z=X+Y$.

Then $E(Z)=E(X+Y)=E(X)+E(Y)$.

The formal proof of this will be given in the next chapter. However, let us note that

$$E(X+Y)=\sum_i\sum_j(x_i+y_j)\,p_{ij}, \quad\text{where } p_{ij}=P(X=x_i,\ Y=y_j)$$

$$=\sum_i\sum_j x_i\,p_{ij}+\sum_i\sum_j y_j\,p_{ij}$$

$$=\sum_i x_i\Big(\sum_j p_{ij}\Big)+\sum_j y_j\Big(\sum_i p_{ij}\Big)$$

$$=\sum_i x_i\,p_i+\sum_j y_j\,p_j$$

$$=E(X)+E(Y).$$

Now

$$V(Z)=E[Z-E(Z)]^2$$

$$=E[X+Y-E(X)-E(Y)]^2$$

$$=E[X-E(X)]^2+2E\{[X-E(X)][Y-E(Y)]\}+E[Y-E(Y)]^2$$

Now $X$, $Y$ are independent; hence we have

$$E\{[X-E(X)][Y-E(Y)]\}=E[X-E(X)]\,E[Y-E(Y)]=0$$

therefore

$$V(X+Y)=V(X)+V(Y)$$

Thus, when we add independent random variables, we add the means and also add the variances.

Cor. Let $x_1, \ldots, x_n$ be $n$ independent variables. Then

$$V(x_1+\ldots+x_n)=V(x_1)+\ldots+V(x_n)$$

This follows from the above property by mathematical induction.

Example. Compute the variance of a binomially distributed random variable with parameters $n$ and $p$.

Sol. First, consider the simple binomial with a single trial, i.e. $n=1$.

X      Pr(X)
0      q        where p + q = 1
1      p

We obtain easily

$$E(x)=0\cdot q+1\cdot p=p$$

$$E(x^2)=0^2\cdot q+1^2\cdot p=p$$

and therefore

$$V(x)=E(x^2)-[E(x)]^2=p-p^2=pq$$

Suppose now that $(x_1, \ldots, x_n)$ is a sample of $n$ from such a distribution. The random variable $y=x_1+\ldots+x_n$ is just the number of 1's in the sample, which is just the number of times the event "1" occurs. We now note that the $x_i$ are independent random variables, since the value of $x_i$ depends only on the outcome of the $i$th repetition, and the successive repetitions are assumed to be independent. Hence we obtain

$$V(y)=V(x_1+\ldots+x_n)=V(x_1)+\ldots+V(x_n)$$

But $V(x_i)=E(x_i^2)-[E(x_i)]^2=pq$ for all $i$.

Thus $V(y)=npq=np(1-p)$.
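The value $npq$ can be confirmed from the binomial probabilities themselves, without using the independence argument. The following sketch (function name illustrative) computes $E(X)$ and $E(X^2)$ directly and applies $V(X)=E(X^2)-[E(X)]^2$.

```python
from math import comb

def binomial_mean_variance(n, p):
    """Mean and variance from the probabilities C(n, x) p^x q^(n - x)."""
    q = 1 - p
    pmf = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
    mean = sum(x * w for x, w in enumerate(pmf))
    second = sum(x * x * w for x, w in enumerate(pmf))
    return mean, second - mean ** 2

# n = 10, p = 0.3: np = 3 and npq = 2.1.
mean, var = binomial_mean_variance(10, 0.3)
```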

Property V. Tchebycheff's inequality.

We shall complete this section with an inequality called Tchebycheff's inequality. This is a well known inequality due to the Russian mathematician Tchebycheff. This inequality shows that the variance does to a certain extent control how far the probability can be spread out. For the random variable $X$ with mean $\mu$ and variance $\sigma^2$ the inequality is

$$P\{|X-\mu|\geq\lambda\sigma\}\leq\frac{1}{\lambda^2}$$

where $\lambda$ is any positive number

(or, equivalently, $P\{|X-\mu|<\lambda\sigma\}\geq 1-1/\lambda^2$).

It says that the probability lying $\lambda$ or more standard deviations from the mean cannot exceed $1/\lambda^2$.

I. Discrete case.

Proof. Define the subset $A$ of the outcome space $S$ by

$$A=\{x_i : |x_i-\mu|\geq\lambda\sigma\}=\{x_i : (x_i-\mu)^2\geq\lambda^2\sigma^2\}$$

Now $S=A\cup\bar A$ and $A\cap\bar A=\phi$ (null set), where $\bar A$ is the complement of the set $A$.

Denoting by $\sum_A$ summation over those values of $i$ for which $x_i\in A$, and similarly $\sum_{\bar A}$ and $\sum_S$, we have

$$\sigma^2=\sum_S p_i(x_i-\mu)^2$$

$$=\sum_A p_i(x_i-\mu)^2+\sum_{\bar A} p_i(x_i-\mu)^2$$

$$\geq\sum_A p_i(x_i-\mu)^2, \quad\text{since } \sum_{\bar A} p_i(x_i-\mu)^2\geq 0,$$

$$\geq\sum_A p_i\,\lambda^2\sigma^2, \quad\text{by the definition of } A,$$

$$=\lambda^2\sigma^2\sum_A p_i$$

Since $\lambda^2\sigma^2>0$ we may divide both sides of the inequality by $\lambda^2\sigma^2$ :

$$\sum_A p_i=P\{X : |X-\mu|\geq\lambda\sigma\}\leq\frac{1}{\lambda^2}$$

The alternative form is easily obtained since

$$P\{X : |X-\mu|\geq\lambda\sigma\}+P\{X : |X-\mu|<\lambda\sigma\}=1.$$

II. Continuous Case.

Let $f(y)$ be the density for $Y=X-\mu$; we have

$$\lambda^2\sigma^2\,P\{|X-\mu|\geq\lambda\sigma\}=\lambda^2\sigma^2\int_{-\infty}^{-\lambda\sigma} f(y)\,dy+\lambda^2\sigma^2\int_{\lambda\sigma}^{\infty} f(y)\,dy$$

$$=\int_{-\infty}^{-\lambda\sigma}\lambda^2\sigma^2 f(y)\,dy+\int_{\lambda\sigma}^{\infty}\lambda^2\sigma^2 f(y)\,dy$$

Over the ranges of integration, we have $\lambda\sigma\leq|y|$. Therefore,

$$\lambda^2\sigma^2\,P\{|X-\mu|\geq\lambda\sigma\}\leq\int_{-\infty}^{-\lambda\sigma} y^2 f(y)\,dy+\int_{\lambda\sigma}^{\infty} y^2 f(y)\,dy\leq\int_{-\infty}^{\infty} y^2 f(y)\,dy=\sigma^2$$

Dividing both sides of this inequality by $\lambda^2\sigma^2$ produces the required result.

If $\lambda=2, 3$, we find that the probability of lying $2\sigma$ or more from the mean cannot exceed 0.250, and the probability of lying $3\sigma$ or more from the mean cannot exceed 0.111.
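The inequality can be examined on small discrete distributions. In the sketch below (function name and distributions are illustrative assumptions) the first distribution attains the bound exactly for $\lambda=2$, which shows that the bound cannot be improved in general; the second is a fair die, where the tail probability is far below the bound.

```python
from math import sqrt

def tail_probability(values, probs, lam):
    """Return (P{|X - mu| >= lam * sigma}, 1 / lam**2) for a discrete X."""
    mu = sum(p * x for x, p in zip(values, probs))
    sigma = sqrt(sum(p * (x - mu) ** 2 for x, p in zip(values, probs)))
    tail = sum(p for x, p in zip(values, probs) if abs(x - mu) >= lam * sigma)
    return tail, 1 / lam ** 2

# Mean 0, variance 1: the whole tail sits exactly at distance 2 sigma.
tail_a, bound_a = tail_probability([-2, 0, 2], [0.125, 0.75, 0.125], 2)
# A fair die with lam = 1.5: no face lies that far from the mean.
tail_b, bound_b = tail_probability([1, 2, 3, 4, 5, 6], [1 / 6] * 6, 1.5)
```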

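Ex. 1. Show that for the exponential distribution

$$dF=y_0\,e^{-x/\sigma}\,dx, \quad 0\leq x<\infty,\ \sigma>0,$$

$y_0$ being a constant, the mean and the standard deviation are each equal to $\sigma$ and that the interquartile range is $\sigma\log_e 3$. [I.A.S. 19 ]

Sol. $dF(x)$ will be a probability distribution function when

$$\int_0^{\infty} dF(x)=\int_0^{\infty} y_0\,e^{-x/\sigma}\,dx=1, \quad\text{i.e. } y_0\sigma=1 \text{ or } y_0=\frac{1}{\sigma}.$$

$$E(X)=\int_0^{\infty}\frac{x}{\sigma}\,e^{-x/\sigma}\,dx=\sigma\int_0^{\infty} y\,e^{-y}\,dy=\sigma\,\Gamma(2)=\sigma, \quad\text{if we put } x/\sigma=y$$

$$E(X^2)=\int_0^{\infty}\frac{x^2}{\sigma}\,e^{-x/\sigma}\,dx=\sigma^2\int_0^{\infty} y^2 e^{-y}\,dy=\sigma^2\,\Gamma(3)=2\sigma^2$$

$$V(X)=E(X^2)-\{E(X)\}^2=2\sigma^2-\sigma^2=\sigma^2$$

$\therefore$ s.d. $=\sigma$.

Also

$$\int_0^{Q_1}\frac{1}{\sigma}\,e^{-x/\sigma}\,dx=\frac{1}{4} \ \Rightarrow\ 1-e^{-Q_1/\sigma}=\frac{1}{4} \ \Rightarrow\ e^{-Q_1/\sigma}=\frac{3}{4}$$

$$\int_0^{Q_3}\frac{1}{\sigma}\,e^{-x/\sigma}\,dx=\frac{3}{4} \ \Rightarrow\ e^{-Q_3/\sigma}=\frac{1}{4}$$

whence $e^{(Q_3-Q_1)/\sigma}=3 \ \Rightarrow\ Q_3-Q_1=\sigma\log_e 3$.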

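Ex. 2. Show that for the "rectangular" distribution

$$dF=dx, \quad 0\leq x\leq 1$$

$\mu_1'$ (about the origin) $=\frac{1}{2}$, $\mu_2=\frac{1}{12}$, mean deviation $=\frac{1}{4}$. [I.A.S. 1950]

Sol. Since $dF=f(x)\,dx$, $f(x)=1$, $0\leq x\leq 1$, we have

$$\mu_1'=\int_0^1 x\,dx=\frac{1}{2}$$

$$\mu_2'=\int_0^1 x^2\,dx=\frac{1}{3}$$

Hence $\mu_2=\mu_2'-\mu_1'^2=\frac{1}{3}-\frac{1}{4}=\frac{1}{12}$.

Mean deviation (about the mean)

$$=\int_0^1\left|x-\tfrac{1}{2}\right|dx=\int_0^{1/2}\left(\tfrac{1}{2}-x\right)dx+\int_{1/2}^{1}\left(x-\tfrac{1}{2}\right)dx=\frac{1}{4}.$$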


3.11. Mean difference. For a discrete frequency distribution the coefficient of mean difference is defined by

$$\Delta_1=\frac{1}{N(N-1)}\sum_i\sum_j|x_i-x_j|\,f_i f_j, \quad i\neq j$$

$$\text{or by}\quad \Delta_1=\frac{1}{N^2}\sum_i\sum_j|x_i-x_j|\,f_i f_j$$

according as the zero differences are either excluded or included.

The difference lies only in the divisor and is unimportant for large $N$.

In the case of a continuous frequency distribution it is given by

$$\Delta_1=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}|x-y|\,f(x)\,f(y)\,dx\,dy.$$

The mean difference is the average of the absolute differences of all possible pairs of variate values. The mean difference, which is due to Gini (1912), has a certain theoretical attraction in the fact that it is dependent on the spread of the variate-values among themselves and not on the deviations from some central value. It is, however, more difficult to compute than the standard deviation, and the appearance of the absolute values in its definition makes it difficult to handle mathematically in the theory of sampling.

A measure depending on the mean difference which is a pure number is Gini's coefficient of concentration, defined by

$$G=\frac{\Delta_1}{2\mu_1'} \qquad \left[\mu_1'=\frac{1}{N}\sum_i f_i x_i, \text{ i.e. the mean}\right]$$

We show below that the variance may in fact be defined as half the mean square of all possible variate differences, that is to say, without reference to deviations from a central value, the mean.

Let

$$E^2=\frac{1}{N^2}\sum_{i=1}^{n}\sum_{j=1}^{n}(x_i-x_j)^2 f_i f_j \qquad \left[N=\sum_i f_i=\sum_j f_j\right]$$

$$=\frac{1}{N^2}\left[\sum_i f_i x_i^2\sum_j f_j-2\sum_i f_i x_i\sum_j f_j x_j+\sum_i f_i\sum_j f_j x_j^2\right]$$

$$=\frac{1}{N}\sum_i f_i x_i^2-2\left(\frac{1}{N}\sum_i f_i x_i\right)\left(\frac{1}{N}\sum_j f_j x_j\right)+\frac{1}{N}\sum_j f_j x_j^2$$

$$=\mu_2'-2(\mu_1')^2+\mu_2'$$

$$=2(\mu_2'-\mu_1'^2)=2\mu_2$$
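The identity $E^2=2\mu_2$ can be verified by forming all pairs of variate values. This is a sketch only; the function names and the frequency distribution are hypothetical, and the mean difference is computed with the zero differences included (divisor $N^2$).

```python
def variance(values, freqs):
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n

def mean_difference(values, freqs):
    """Mean difference with the zero differences included (divisor N**2)."""
    n = sum(freqs)
    total = sum(fi * fj * abs(xi - xj)
                for xi, fi in zip(values, freqs)
                for xj, fj in zip(values, freqs))
    return total / n ** 2

def half_mean_square_difference(values, freqs):
    """Half the mean square of all possible variate differences."""
    n = sum(freqs)
    total = sum(fi * fj * (xi - xj) ** 2
                for xi, fi in zip(values, freqs)
                for xj, fj in zip(values, freqs))
    return total / (2 * n ** 2)

values, freqs = [1, 2, 4, 7], [2, 3, 1, 4]
```

For this distribution `half_mean_square_difference(values, freqs)` equals `variance(values, freqs)`, although no mean is computed in the former.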

3.12. Moments.

The $n$th moment about $a$ is defined by

$$\mu_n'=\frac{1}{N}\sum_{i=1}^{n} f_i(x_i-a)^n.$$

The first moment about $a$ is then

$$\frac{1}{N}\sum_{i=1}^{n} f_i(x_i-a)$$

which is the same as the sum of moments about $a$ of forces, and the second moment about $a$ is

$$\frac{1}{N}\sum_{i=1}^{n} f_i(x_i-a)^2$$

which is the moment of inertia of the particles $f_1, f_2, \ldots, f_n$ at $x_1, x_2, \ldots, x_n$.

The first moment about zero is $(1/N)\sum f_i x_i=\bar x$ (mean).

The $n$th moment about the mean $\bar x$ is written without dash and is

$$\mu_n=(1/N)\sum f_i(x_i-\bar x)^n$$

so that $\mu_1=0$ and $\mu_2=\sigma^2$, the variance.

The moments about the mean are of much importance because their values are independent of the origin of reference. These are called central moments.

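3.13. Central Moments, or simply moments about the mean, in terms of moments about any point and conversely.

By definition,

$$\mu_n=\frac{1}{N}\sum_{i=1}^{n} f_i(x_i-\bar x)^n=\frac{1}{N}\sum_{i=1}^{n} f_i\{(x_i-a)-(\bar x-a)\}^n$$

$$=\frac{1}{N}\sum_i f_i(X_i-d)^n, \quad\text{where } X_i=x_i-a,\ d=\bar x-a$$

$$=\frac{1}{N}\left[\sum_i f_i X_i^n-\binom{n}{1}d\sum_i f_i X_i^{n-1}+\binom{n}{2}d^2\sum_i f_i X_i^{n-2}-\ldots+(-1)^{n-1}\binom{n}{n-1}d^{n-1}\sum_i f_i X_i+(-1)^n d^n\sum_i f_i\right]$$

$$=\mu_n'-\binom{n}{1}d\,\mu_{n-1}'+\binom{n}{2}d^2\mu_{n-2}'-\ldots+(-1)^n d^n, \quad\text{where } d=\mu_1' \qquad ...(1)$$

Particularly,

$$\mu_2=E(x^2-2x\bar x+\bar x^2)$$

$$=E(x^2)-2\bar xE(x)+\bar x^2$$

$$=E(x^2)-\bar x^2$$

$$=\mu_2'-(\mu_1')^2 \qquad ...(2)$$

where $\mu_1'=E(x)=\bar x$.

$$\mu_3=E\{(x-\bar x)^3\}$$

$$=E(x^3)-3E(x^2)\,\bar x+3E(x)\,\bar x^2-\bar x^3$$

$$=\mu_3'-3\mu_2'\mu_1'+2\mu_1'^3 \qquad ...(3)$$

$$\mu_4=\frac{1}{N}\sum_i f_i\{X_i-d\}^4$$

$$=\frac{1}{N}\sum_i f_i\{X_i^4-4X_i^3d+6X_i^2d^2-4X_id^3+d^4\}$$

$$=\mu_4'-4\mu_3'\mu_1'+6\mu_2'\mu_1'^2-3\mu_1'^4 \qquad ...(4)$$

Here also $1-4+6-3=0$.

The formulae from (1) to (4) should be committed to memory. The reader should keep in mind that in each of these relations, the sum of the coefficients of the various terms of the right member is equal to zero and that each term of the right member is of the same dimension as the term on the left.

Conversely

$$\mu_n'=\frac{1}{N}\sum_{i=1}^{n} f_i(x_i-a)^n=\frac{1}{N}\sum_{i=1}^{n} f_i(x_i-\bar x+\bar x-a)^n$$

$$=\frac{1}{N}\sum_i f_i(X_i+d)^n, \quad\text{where now } X_i=x_i-\bar x,\ d=\bar x-a$$

$$=\frac{1}{N}\left[\sum_i f_i X_i^n+\binom{n}{1}d\sum_i f_i X_i^{n-1}+\binom{n}{2}d^2\sum_i f_i X_i^{n-2}+\ldots+\binom{n}{n-1}d^{n-1}\sum_i f_i X_i+d^n\sum_i f_i\right]$$

$$=\mu_n+\binom{n}{1}\mu_{n-1}d+\binom{n}{2}\mu_{n-2}d^2+\ldots+\binom{n}{n-2}\mu_2 d^{n-2}+d^n, \quad\text{since } \mu_1=0$$

Particularly,

$$\mu_2'=\mu_2+d^2, \quad \mu_3'=\mu_3+3d\mu_2+d^3, \quad \mu_4'=\mu_4+4d\mu_3+6d^2\mu_2+d^4$$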
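Relations (2), (3) and (4) can be collected into one routine and checked against central moments computed directly from data. The sketch below uses hypothetical names (`central_from_raw`, `moment_about`) and a hypothetical set of observations.

```python
def central_from_raw(m1, m2, m3, m4):
    """Central moments mu_2, mu_3, mu_4 from the first four moments
    m1..m4 taken about an arbitrary point a."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4

def moment_about(xs, a, order):
    """Moment of the given order about the point a (unit frequencies)."""
    return sum((x - a) ** order for x in xs) / len(xs)

xs = [1.0, 2.0, 2.0, 3.0, 7.0]
raw = [moment_about(xs, 2.0, k) for k in (1, 2, 3, 4)]  # moments about a = 2
converted = central_from_raw(*raw)
mean = sum(xs) / len(xs)
direct = tuple(moment_about(xs, mean, k) for k in (2, 3, 4))
```

Here `converted` and `direct` agree, whatever point $a$ is chosen for the raw moments.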

3.14. Effect of change of origin and scale on moments.

If the variate $x$ is connected with $u$ by the relation

$$u=\frac{x-a}{h} \quad\text{or}\quad x=a+hu,$$

then $\bar x=a+h\bar u$ and $x-\bar x=(u-\bar u)h$.

The $r$th moment of $x$ about the point $a$ is

$$\mu_r'=\frac{1}{N}\sum_i f_i(x_i-a)^r=\frac{1}{N}\sum_i f_i u_i^r h^r=h^r\times r\text{th moment of } u \text{ about the origin.}$$

Also the $r$th central moment of $x$ is

$$\mu_r=\frac{1}{N}\sum_i f_i(x_i-\bar x)^r=\frac{1}{N}\sum_i f_i(u_i-\bar u)^r h^r=h^r\times r\text{th central moment of } u.$$

In particular, if $u=\dfrac{x-\bar x}{\sigma_x}$, then $x=\bar x+u\sigma_x$ yields

$$E(x)=E(\bar x+u\sigma_x)=\bar x+\sigma_xE(u)$$

or $\bar x=\bar x+\sigma_xE(u) \ \Rightarrow\ E(u)=0$.

Also

$$V(u)=E[u-E(u)]^2=E(u^2)=E\left[\frac{(x-\bar x)^2}{\sigma_x^2}\right]=\frac{1}{\sigma_x^2}E(x-\bar x)^2=1$$

Thus we see that the distribution of $u$ has zero mean and unit variance, or we say that $\dfrac{x-\bar x}{\sigma}$ is a standardized variate and the distribution expressed as a function of it is in standard form.

The variable $u$ is a pure number, since $x$ and $\sigma$ are in the same units.

3.15. Sheppard's Corrections. In grouped frequency distributions the class intervals are replaced by their mid-values and it is assumed that the corresponding class frequencies are concentrated there. Naturally this assumption introduces an error due to grouping. In certain circumstances it is possible, as Sheppard showed, to allow for the error of grouping. The corrections, as applied to the grouped moments, are as follows.

$$\mu_1 \text{ (corrected)}=\mu_1$$

$$\mu_2 \text{ (corrected)}=\mu_2-\frac{h^2}{12}$$

$$\mu_3 \text{ (corrected)}=\mu_3$$

$$\mu_4 \text{ (corrected)}=\mu_4-\frac{1}{2}h^2\mu_2+\frac{7}{240}h^4$$

where $h$ is the width of the class intervals.

It is by no means certain that the use of these corrections will improve the estimates of the moments of the parent population, but very frequently it will do so. As we shall see later, for small samples the sampling errors of the moments will greatly exceed the corrections, and it is not worth while applying Sheppard's corrections unless the sample consists of at least a few hundred individuals. The corrections should be used only when the distribution tails off gradually at both ends and the frequency distribution is continuous.

Ex. For a grouped distribution, show that if $h<\frac{1}{3}\sigma$, Sheppard's correction makes a difference of less than 0.5 percent in the estimate of $\sigma$. [Lucknow 48]

Sol. $h<\frac{1}{3}\sigma \ \Rightarrow\ \dfrac{h^2}{12}<\dfrac{\sigma^2}{108}$

Now

$$(\sigma^2)_c=\sigma^2-\frac{h^2}{12}>\sigma^2-\frac{\sigma^2}{108}$$

$$\text{or}\quad (\sigma)_c>\sigma\left(1-\frac{1}{108}\right)^{1/2}=\sigma\left(1-\frac{1}{216}\right) \text{ approx.}$$

i.e. $\sigma-(\sigma)_c<\dfrac{\sigma}{216}<0.5$ percent of $\sigma$.
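The corrections for $\mu_2$ and $\mu_4$, and the relative change they produce in $\sigma$, can be sketched as follows. The function names and the figures $\mu_2=9$, $\mu_4=200$, $h=0.9$ are hypothetical, chosen so that $h<\sigma/3$.

```python
from math import sqrt

def sheppard_corrected(mu2, mu4, h):
    """Sheppard-corrected second and fourth central moments
    for class-interval width h (mu_1 and mu_3 need no correction)."""
    mu2_c = mu2 - h ** 2 / 12
    mu4_c = mu4 - 0.5 * h ** 2 * mu2 + 7 * h ** 4 / 240
    return mu2_c, mu4_c

def relative_sd_change(mu2, h):
    """Fraction by which the correction lowers the estimate of sigma."""
    return 1 - sqrt((mu2 - h ** 2 / 12) / mu2)

mu2_c, mu4_c = sheppard_corrected(9.0, 200.0, 0.9)  # sigma = 3, h < sigma / 3
change = relative_sd_change(9.0, 0.9)               # below 0.005, i.e. 0.5 percent
```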

3.16. Charlier's Check. There are various checks in use for the arithmetic of calculation; we have

$$\sum f(\xi+1)^3=\sum(f\xi^3)+3\sum(f\xi^2)+3\sum(f\xi)+N$$

$$\sum f(\xi+1)^4=\sum(f\xi^4)+4\sum(f\xi^3)+6\sum(f\xi^2)+4\sum(f\xi)+N$$

and so on.

Thus, in calculating $\sum(f\xi^n)$ we also find $\sum\{f(\xi+1)^n\}$, and this, together with the sums of lower orders, will give us a ready check on the work.
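Charlier's check can be sketched for any order by expanding $(\xi+1)^n$ with binomial coefficients. The function name is illustrative, and the working values and frequencies below are those of the grouped distribution in Exercise 5 of the preceding exercise set.

```python
from math import comb

def charlier_check(xi, freqs, order):
    """Return both sides of Charlier's check for the given order:
    sum f (xi + 1)^order, and the binomial combination of the sums sum f xi^k."""
    lhs = sum(f * (x + 1) ** order for x, f in zip(xi, freqs))
    rhs = sum(comb(order, k) * sum(f * x ** k for x, f in zip(xi, freqs))
              for k in range(order + 1))  # the k = 0 term is N
    return lhs, rhs

xi = [-3, -2, -1, 0, 1, 2, 3, 4]
freqs = [3, 15, 45, 57, 50, 36, 25, 9]
lhs3, rhs3 = charlier_check(xi, freqs, 3)
lhs4, rhs4 = charlier_check(xi, freqs, 4)
```

With integer working values the two sides agree exactly; a disagreement in hand computation points to an arithmetical slip in one of the sums.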

3.17. Measures of Asymmetry or Skewness. When the mean is taken as origin $x=0$, it may happen that $f(x)=f(-x)$, so that the distribution is symmetrical; for example, the distribution of the number of heads in a throw of $n$ symmetrical coins, described by the generating function $(q+pt)^n$, where $p=\frac{1}{2}$, $q=\frac{1}{2}$, is symmetrical about $x=\frac{n}{2}$. Also the continuous distribution described by

$$dp=\frac{1}{\sigma\sqrt{(2\pi)}}\,e^{-(x-\mu)^2/2\sigma^2}\,dx$$

is symmetrical about $x=\mu$.

Lack of symmetry, i.e. skewness, is revealed numerically in various ways.

Various Measures of Skewness. In a symmetrical distribution the distances of the quartiles $Q_1$ and $Q_3$ from the median $Q_2$ will be equal. In a skew distribution the difference between $Q_3-Q_2$ and $Q_2-Q_1$ gives a coefficient of skewness, namely

$$\{(Q_3-Q_2)-(Q_2-Q_1)\}/\sigma=(Q_3-2Q_2+Q_1)/\sigma$$

$\sigma$ in the denominator is introduced for the purpose of removing arbitrary units of scale and obtaining an absolute coefficient. The coefficient of skewness, in case $\sigma$ is not known, is given by

$$(Q_3-2Q_2+Q_1)/(Q_3-Q_1).$$

The third moment about the mean, $\mu_3$, gives a natural measure of skewness. If the distribution is symmetrical, $\mu_3=0$. If the long tail of the distribution is on the side of the positive values of $x$, the cubes of positive values of $x$ outweigh the cubes of negative values, so that $\mu_3$ is positive, and we have positive skewness. In the same way, if the frequency curve has a longer tail on the side of the negative values of $x$, then $\mu_3$ is negative and we have negative skewness.

The moment-measure of skewness of a theoretical distribution is defined by

$$\beta_1=\frac{\mu_3^2}{\mu_2^3} \quad\text{or by}\quad \gamma_1=\pm\sqrt{\beta_1}=\frac{\mu_3}{\sigma^3}.$$

The former is Karl Pearson's notation and the latter Fisher's.

$\gamma_1$ is a pure number, 0 for a symmetrical distribution, positive for a distribution with a long tail to the right (because of the high contribution to $(x-\bar x)^3$ for large positive values of $x$) and negative for a distribution with a long tail to the left.

Another measure of skewness (due to K. Pearson) is defined by

(Mean - Mode)/(Standard Deviation)

For in a skew curve the mean, median and mode are not the same.

Like $\mu_3$ it is positive for positive skewness, zero for symmetry, negative for negative skewness.

3.18. $\beta$ and $\gamma$-coefficients.

Certain quantities calculated from the moments about the mean are of particular importance in statistical work; we define

$$\beta_1=\frac{\mu_3^2}{\mu_2^3}, \qquad \beta_2=\frac{\mu_4}{\mu_2^2},$$

$$\gamma_1=+\sqrt{\beta_1}, \qquad \gamma_2=\beta_2-3=\frac{\mu_4-3\mu_2^2}{\mu_2^2}.$$

These four coefficients are all pure numbers and, as such, are independent of the scale of measurement of the variable, for $\mu_3^2$ has the dimensions (variable)$^6$ and so has $\mu_2^3$, and hence their quotient has dimension zero, i.e. is a pure number, and similarly for the quotient of $\mu_4$ and $\mu_2^2$, each being of dimensions (variable)$^4$.

3.19. Measure of Flattening or Excess, kurtosis.

Two distributions may have identical means, standard deviations and skewness, and yet may differ in that the curve of the one may be more flattened at the centre (platykurtic) than that of the other.

The degree of flattening is measured by the fourth moment about the mean, $\mu_4$. Removing arbitrary units of measure, we obtain $\beta_2=\mu_4/\mu_2^2$. In case of probability curves with $\mu_2=\sigma^2=1$, the ordinate at the mean or mode is greater or less according as $\beta_2$ itself is greater or less. Thus the value of $\beta_2$ serves to indicate whether the curve is tall and slim at the centre (leptokurtic) or squat (platykurtic). The relative flatness of top is called the kurtosis and is measured by $\beta_2$. As we shall see later, the value of $\beta_2$ is 3 for the normal probability curve.

Hence $\beta_2-3$ is sometimes called the excess. Curves for which $\beta_2<3$ are called platykurtic and those for which $\beta_2>3$ are called leptokurtic, the normal curve being taken as standard.
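The $\beta$ and $\gamma$ coefficients can be computed from a frequency distribution as in the sketch below. The function name is an illustrative assumption, and $\gamma_1$ is given the sign of $\mu_3$ by computing it as $\mu_3/\mu_2^{3/2}$.

```python
def shape_coefficients(values, freqs):
    """Return (beta1, beta2, gamma1, gamma2) for a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n

    def central(k):
        return sum(f * (x - mean) ** k for x, f in zip(values, freqs)) / n

    mu2, mu3, mu4 = central(2), central(3), central(4)
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    gamma1 = mu3 / mu2 ** 1.5  # carries the sign of mu3
    gamma2 = beta2 - 3         # the excess
    return beta1, beta2, gamma1, gamma2

# Symmetrical distribution on 0, 1, 2 with frequencies 1, 2, 1.
beta1, beta2, gamma1, gamma2 = shape_coefficients([0, 1, 2], [1, 2, 1])
```

For this symmetrical distribution $\beta_1=0$ and $\beta_2=2<3$, so it is platykurtic with excess $-1$.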

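Ex. 1. Show that for a discrete distribution $\beta_2\geq 1$.

[Agra, Lucknow 66]

Sol. We have

$$\beta_2=\frac{\mu_4}{\mu_2^2}=\frac{\mu_4}{\sigma^4} \qquad ...(1)$$

Without loss of generality we can assume that the mean of the distribution, i.e. $\bar x=E(x)$, is 0, so that

$$\sigma^2=\frac{1}{n}\sum_i x_i^2, \qquad \mu_4=\frac{1}{n}\sum_i x_i^4$$

Now

$$E\{(x^2-\sigma^2)^2\}=E\{x^4-2\sigma^2x^2+\sigma^4\}$$

i.e.

$$\frac{1}{n}\sum_i(x_i^2-\sigma^2)^2=\frac{1}{n}\sum_i x_i^4-2\sigma^2\,\frac{1}{n}\sum_i x_i^2+\sigma^4$$

$$=\frac{1}{n}\sum_i x_i^4-2\sigma^4+\sigma^4=\mu_4-\sigma^4$$

Hence

$$\frac{\mu_4}{\sigma^4}-1=\frac{1}{n}\sum_i\left(\frac{x_i^2-\sigma^2}{\sigma^2}\right)^2$$

The right-hand side, being a sum of squares, is always greater than or equal to zero :

$$\beta_2-1\geq 0 \ \Rightarrow\ \beta_2\geq 1.$$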

Ex. 2. The first four moments of a distribution about the value 4 of the variable are $-1.5$, 17, $-30$ and 108. Calculate the central and simple moments, and state whether the distribution is leptokurtic or platykurtic. [Agra 1960]

Sol. We have

$E(x-4)=-1.5 \ \Rightarrow\ \mu_1'=E(x)=4-1.5=2.5$

$E(x-4)^2=17$ or $E(x^2-8x+16)=17$

i.e. $\mu_2'-8\mu_1'+16=17 \ \Rightarrow\ \mu_2'=E(x^2)=21$

$E(x-4)^3=-30 \ \Rightarrow\ E(x^3-12x^2+48x-64)=-30$

Whence $\mu_3'-12\mu_2'+48\mu_1'-64=-30 \ \Rightarrow\ \mu_3'=166$.

$E(x-4)^4=108$ or $E(x^4-16x^3+96x^2-256x+256)=108$

Whence $\mu_4'-16\mu_3'+96\mu_2'-256\mu_1'+256=108 \ \Rightarrow\ \mu_4'=1132$.

Central Moments

$\mu_2=\mu_2'-(\mu_1')^2=21-(2.5)^2=14.75$

$\mu_3=\mu_3'-3\mu_2'\mu_1'+2\mu_1'^3 \ \Rightarrow\ \mu_3=39.75$

$\mu_4=\mu_4'-4\mu_3'\mu_1'+6\mu_2'\mu_1'^2-3\mu_1'^4=142.3125$.

$$\beta_2=\frac{\mu_4}{\mu_2^2}=\frac{142.3125}{(14.75)^2}=0.654<3,$$

hence the curve is platykurtic.


3.20. Factorial Moments. Absolute Moments :

The factorial moment of order $r$ about $x=0$ of a distribution is defined by

$$\mu'_{(r)}=\frac{1}{N}\sum_i f_i\,x_i^{(r)}, \quad N=\sum_i f_i \qquad ...(1)$$

where $x^{(r)}=x(x-1)(x-2)\ldots(x-r+1)$.

Thus

$$\left.\begin{array}{l}
\mu'_{(2)}=\dfrac{1}{N}\sum_i f_i\,x_i(x_i-1)=E\{x(x-1)\}=E(x^2-x)\\[4pt]
\qquad=E(x^2)-E(x)=\mu_2'-\mu_1', \quad\text{where } \mu_1'=E(x)=\bar x\\[4pt]
\mu'_{(3)}=E\{x(x-1)(x-2)\}=E(x^3-3x^2+2x)\\[4pt]
\qquad=\mu_3'-3\mu_2'+2\mu_1'
\end{array}\right\} \quad (A)$$

The absolute moment of order $r$ about $x=0$ of a distribution is defined by

$$\nu_r=\frac{1}{N}\sum_i f_i\,|x_i|^r, \quad N=\sum_i f_i \qquad ...(2)$$

Remark. If the factorial moments are known, it is easily seen that the ordinary moments can be calculated successively using relations (A).

The expression for the factorial moment $\mu_{(r)}$ about the mean is obtained from that defining $\mu'_{(r)}$ by replacing $x$ by the deviation $x-\bar x$ from the mean. The factorial moments about the mean are connected with those about the origin by the equations

$$\mu_{(2)}=E\{(x-\bar x)(x-\bar x-1)\}$$

$$=E\{x(x-1)-2x\bar x+\bar x^2+\bar x\}$$

$$=E\{x(x-1)\}-\bar x^2+\bar x, \quad\text{since } E(x\bar x)=\bar xE(x)=\bar x^2$$

$$=\mu'_{(2)}-\bar x^2+\bar x$$

Similarly

$$\mu_{(3)}=\mu'_{(3)}-3\mu'_{(2)}\bar x+2\bar x^3-2\bar x.$$
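Relations (A) between the factorial and the ordinary moments can be checked on a small distribution. This is a sketch; the function names are illustrative, and the distribution taken is that of the number of heads in three throws of a fair coin.

```python
def factorial_moment(values, probs, r):
    """Factorial moment of order r about 0: E{x(x-1)...(x-r+1)}."""
    total = 0.0
    for x, p in zip(values, probs):
        term = 1.0
        for k in range(r):
            term *= (x - k)
        total += p * term
    return total

def raw_moment(values, probs, r):
    """Ordinary moment of order r about 0: E(x^r)."""
    return sum(p * x ** r for x, p in zip(values, probs))

values = [0, 1, 2, 3]
probs = [1 / 8, 3 / 8, 3 / 8, 1 / 8]
m1, m2, m3 = (raw_moment(values, probs, r) for r in (1, 2, 3))
f2 = factorial_moment(values, probs, 2)  # equals m2 - m1
f3 = factorial_moment(values, probs, 3)  # equals m3 - 3*m2 + 2*m1
```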

EXERCISES

Ex. 1. Calculate the mean and the factorial moment of $k$th order of the Poisson distribution with parameter $m$. Also make use of the factorial moment in finding the variance of the distribution.

Hint. For the Poisson distribution $P(X=x)=e^{-m}\dfrac{m^x}{x!}$.

$$\mu_1'=E(x)=\sum_{x=0}^{\infty} x\,e^{-m}\frac{m^x}{x!}=e^{-m}\,m\sum_{x=1}^{\infty}\frac{m^{x-1}}{(x-1)!}=e^{-m}\,m\sum_{y=0}^{\infty}\frac{m^y}{y!}=e^{-m}\,m\,e^{m}=m$$

$$\mu'_{(k)}=E\{x(x-1)\ldots(x-k+1)\}$$

$$=\sum_{x=0}^{\infty} x(x-1)\ldots(x-k+1)\,\frac{m^x}{x!}\,e^{-m}$$

$$=e^{-m}\sum_{x=k}^{\infty}\frac{m^x}{(x-k)!}=e^{-m}\,m^k\sum_{y=0}^{\infty}\frac{m^y}{y!}=m^k$$

With $k=2$ we have

$$\mu'_{(2)}=m^2=E(x^2-x) \ \Rightarrow\ E(x^2)=m^2+m$$

and hence

$$\mu_2=E(x^2)-\{E(x)\}^2=m^2+m-m^2=m.$$

Ex. 2. (i) Prove the following inequality for the absolute moments

$$\nu_r^{2r}\leq\nu_{r-1}^{\,r}\,\nu_{r+1}^{\,r}$$

(ii) Replacing $r$ successively by $1, 2, \ldots, r$ and multiplying all the inequalities thus formed, show that

$$\nu_r^{1/r}\leq\nu_{r+1}^{1/(r+1)}$$

[M.A. Poona 59]

Ex. 3. Show that for the binomial distribution,

$$P(X=x)=\binom{n}{x}p^xq^{n-x}, \quad x=0, 1, 2, \ldots, n$$

the factorial moments about the mean are

$$\mu_{(2)}=npq, \qquad \mu_{(3)}=-2npq(p+1)$$

and that, for the Poissonian distribution,

$$\mu_{(2)}=m, \qquad \mu_{(3)}=-2m.$$

Ex. 4. The first three moments of a distribution about the value 2 of the variable are 1, 16 and -40. Show that the mean is 3, the variance 15, and $\mu_3 = -86$. Also show that the first three moments about x = 0 are 3, 24 and 76.   [Agra 63]

Ex. 5. In a continuous distribution, whose relative frequency density is given by f(x) = 3x(2-x)/4, the variable ranges from 0 to 2. Show that the distribution is symmetrical, with mean $\bar{x} = 1$, and variance 1/5. Show that the second and third moments about x = 0 are 6/5 and 8/5 respectively, and verify that $\mu_3 = 0$.

Ex. 6. Show that for the continuous frequency distribution

$dF = a\,e^{-ax}\,dx$, $0 \le x < \infty$, a > 0

the mean is 1/a and the variance $1/a^2$.

Also prove that the second and third moments about x = 0 are $2/a^2$ and $6/a^3$ respectively, and that $\mu_3 = 2/a^3$.

3.21. Moments for a Bivariate Frequency Distribution. Suppose the pair $(x_i, y_i)$ occurs with frequency $f_i$, where i = 1, 2, 3, ..., n and $\sum_{i=1}^{n} f_i = N$. The assemblage of pairs of values, together with their frequencies, constitutes a bivariate frequency distribution. The pairs of values may be the heights and the weights of a group of men, or the amount of fertilizer per acre and the yield of grain per acre on a number of different plots of land.

The moment of order r in x and s in y is defined by

$\mu'_{rs} = \frac{1}{N}\sum_i f_i\,x_i^r\,y_i^s$.

In particular

$\mu'_{10} = \frac{1}{N}\sum_i f_i x_i = \bar{x}$, $\mu'_{01} = \frac{1}{N}\sum_i f_i y_i = \bar{y}$,

where $\bar{x}$ is the mean value of x in the distribution, and $\bar{y}$ the mean value of y. Similarly,

$\mu'_{20} = \frac{1}{N}\sum_i f_i x_i^2 = \sigma_x^2 + \bar{x}^2$, $\mu'_{02} = \frac{1}{N}\sum_i f_i y_i^2 = \sigma_y^2 + \bar{y}^2$,

where $\sigma_x^2 = \mu_{20}$ denotes the variance of the variable x in the distribution, that is, the moment about the mean $(\bar{x}, \bar{y})$; likewise $\sigma_y^2$ is the variance of y, or the moment $\mu_{02}$ about the mean.

Corresponding to the moment $\mu'_{11}$ about the origin we have the moment $\mu_{11}$ about the mean of the distribution defined by

$\mu_{11} = \frac{1}{N}\sum_i f_i\,(x_i - \bar{x})(y_i - \bar{y})$.

$\mu_{11}$ is called the covariance of the variables and is sometimes denoted by cov (x, y).


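Now

$\mu_{11} = E\{(x - \bar{x})(y - \bar{y})\}$

$= E\{xy - x\bar{y} - \bar{x}y + \bar{x}\bar{y}\}$
$= E(xy) - \bar{y}\,E(x) - \bar{x}\,E(y) + \bar{x}\bar{y}$
$= E(xy) - \bar{x}\bar{y}$

$= \mu'_{11} - \bar{x}\bar{y}$, where $\mu'_{11} = \frac{1}{N}\sum_i f_i x_i y_i$.

This formula is very important, and will be used frequently.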
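The identity between the covariance computed from deviations and the form $\mu'_{11} - \bar{x}\bar{y}$ can be illustrated on a small table. This is a sketch only: the function name and the data values are hypothetical and chosen purely for illustration.

```python
def bivariate_summary(xs, ys, fs):
    """Return (x_bar, y_bar, covariance from deviations, covariance from mu'_11)."""
    n = sum(fs)
    x_bar = sum(f * x for f, x in zip(fs, xs)) / n
    y_bar = sum(f * y for f, y in zip(fs, ys)) / n
    # mu'_11: product moment about the origin
    mu11_origin = sum(f * x * y for f, x, y in zip(fs, xs, ys)) / n
    # mu_11 computed directly from deviations about the means
    cov_direct = sum(f * (x - x_bar) * (y - y_bar) for f, x, y in zip(fs, xs, ys)) / n
    # mu_11 computed from the shortcut formula
    cov_shortcut = mu11_origin - x_bar * y_bar
    return x_bar, y_bar, cov_direct, cov_shortcut

# Hypothetical bivariate frequency table: pairs (x_i, y_i) with frequencies f_i.
xs = [1, 2, 3, 4]
ys = [2, 3, 5, 4]
fs = [1, 2, 2, 1]
x_bar, y_bar, cov_direct, cov_shortcut = bivariate_summary(xs, ys, fs)
```

The two covariance values agree up to rounding error; the shortcut form needs only one pass over the data once the means are known.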

Exercise 

1. Explain the significance of the following and give their measures: (a) variance, (b) quartile deviation, (c) coefficient of variation, (d) skewness and kurtosis, (e) positive and negative skewness.

2. Calculate the arithmetic mean and the standard deviation of the following values of the world's annual gold output (in millions of pounds) for 20 different years :

94 95 96 93 87 79 73 69 63 67 

78 82 i3 89 95 103 108 117 130 97 


[M. Sc. Agra 59]

[Ans. Mean = 90.15, s.d. = 15.99 approx.]

3. Goals scored by two teams A and B in a football season were as follows :

No. of goals scored in a match :    0    1    2    3    4

No. of matches       A :           27    9    8    5    4
                     B :           17    9    6    5    3

Which team is more efficient ?

[Ans. C.V. for A = 115.455, C.V. for B = 109.167; B is more consistent than team A]

4. (a) The means of two samples of sizes 50 and 100 respectively are 54.1 and 50.3 and the standard deviations are 8 and 7. Obtain the mean and standard deviation of the sample of size 150 obtained by combining the two samples.   [Delhi 66]

[Ans. Mean = 51.57, S.D. = 7.5 approx.]

(b) The first of two samples has 100 items with mean 15 and standard deviation 3. If the whole group has 250 items with mean 15.6 and standard deviation $\sqrt{13.44}$, find the standard deviation of the second group.   [Poona 67]

[Ans. 4]



5. The first four moments of a distribution about the value 5 of the variable are 2, 20, 40 and 50. Show that the mean is 7, the variance 16, and $\mu_4 = 162$.   [Lucknow]

6. Calculate the first four moments about the mean for the following data. Also calculate $\beta_1$ and $\beta_2$.

x :    1    2    3    4    5    6    7    8    9

f :    1    6   13   25   30   22    9    5    2

[Ans. $\mu'_1 = -0.09$, $\mu_2 = 2.49$, $\mu_3 = 0.67$, $\mu_4 = 18.33$ approx.

$\beta_1 = 0.03$ approx., $\beta_2 = 3$ approx.]

7. For a distribution of 250 heights, calculations showed that the mean, standard deviation, $\beta_1$ and $\beta_2$ were 54 inches, 3 inches, 0 and 3 respectively. It was however discovered on checking that the two items 64 and 50 in the original data were wrongly written in place of the correct values 62 and 52 inches respectively. Calculate the correct frequency constants.

[Ans. Mean remains unaltered, s.d. = 2.968,

$\beta_1 = 0.004$, $\beta_2 = 2.815$]

8. Show that for discrete distributions

$\beta_2 > \beta_1$.   [Lucknow 66]

9. A variate has the density function

$f(x) = \frac{1}{16}(3 + x)^2$, $-3 < x < -1$

$= \frac{1}{16}(6 - 2x^2)$, $-1 < x < 1$

$= \frac{1}{16}(3 - x)^2$, $1 < x < 3$.

Find the mean and standard deviation of the distribution.

[Ans. Mean = 0, S.D. = 1]

10. Prove that the geometric mean, G, of the distribution

$dF = 6(2 - x)(x - 1)\,dx$, $1 \le x \le 2$

is given by $6 \log (16G) = 19$.   [Delhi 56]

11. For the continuous distribution

$dF = y_0\,(x - x^2)\,dx$, $0 \le x \le 1$, $y_0$ being a constant, find the arithmetic mean, the harmonic mean, the mode and the median.

[Ans. $y_0 = 6$, A.M. = 1/2, H.M. = 1/3, Modal value = 1/2, Median = 1/2]

12. For the continuous distribution

$dF = \frac{3}{4}\,x(2 - x)\,dx$, $0 \le x \le 2$,

find the mean, mean deviation about mean, $\mu'_2$, $\mu'_3$, $\mu'_4$, $\mu_2$, $\mu_3$, $\mu_4$ and also skewness. Also show that for this distribution $\mu_{2n+1} = 0$. Deduce $\beta_1$.   [Delhi 58, 60]

[Ans. Mean = 1, $\mu'_2 = 6/5$, $\mu_2 = 1/5$, $\mu'_3 = 8/5$, $\mu_3 = 0$, skewness = 0, mean deviation = 3/8]

13. The elementary probability law of a continuous random variable x is $p(x) = y_0\,e^{-b(x - a)}$, $a \le x < \infty$, where a, b, $y_0$ are constants. Show that $y_0 = b = 1/\sigma$ and $a = m - \sigma$, where m and $\sigma$ are the mean and standard deviation of the distribution. Show also that $\beta_1 = 4$, $\beta_2 = 9$.   [Delhi 57, 60]

14. If x is chosen at random in the interval (a, b), what are E(x) and V(x) ?

[Ans. $\dfrac{a + b}{2}$, $\dfrac{(b - a)^2}{12}$]


15. State and prove Tchebycheff's inequality and show by means of an example that it cannot be improved.   [Bombay 64]

16. (a) Show that if X is non-negative,

$E\left(\dfrac{1}{X}\right) \ge \dfrac{1}{E(X)}$.   [M.A. Statistics, Delhi 70]

(b) If x, y are independent continuous random variables with means $\mu(x)$ and $\mu(y)$ and variances $\sigma_x^2$ and $\sigma_y^2$ respectively, show that

(i) $\sigma^2(xy) = \sigma^2(x)\,\sigma^2(y) + \mu^2(x)\,\sigma^2(y) + \mu^2(y)\,\sigma^2(x)$

[M.A. Statistics, Delhi 70]

(ii) $\dfrac{\sigma^2(xy)}{[E(x)]^2\,[E(y)]^2} = C_x^2\,C_y^2 + C_x^2 + C_y^2$,

where $C_x = \sigma_x/\mu(x)$ and $C_y = \sigma_y/\mu(y)$.

4 


Theory of Probability 


4.1. Introduction. One of the fundamental tools of statistics is probability, which had its beginnings with games of chance in the seventeenth century. Games of chance include such actions as throwing a die, tossing a coin, drawing a card, etc., in which the outcome of a trial is uncertain. However, even though the outcome of any particular trial may be uncertain, there is a predictable long-term outcome. It is known, for example, that in many throws of an ideal coin about one-half of the trials will result in heads.

A similar type of uncertainty and long-term regularity often occurs in experimental science. For example, in the science of genetics it is uncertain whether an offspring will be male or female, but in the long run it is known approximately what per cent of offspring will be male and what per cent will be female.

Whenever we use mathematics in order to study some observational phenomenon we must essentially begin by building a mathematical model (deterministic or probabilistic) for this phenomenon.

There are many examples of 'experiments' in nature for which deterministic models are appropriate. For example, the gravitational law describes quite precisely what happens to a falling body under certain conditions. There are also many phenomena which require a non-deterministic or probabilistic model for their investigation. For example, let us consider a piece of radioactive material which is emitting α-particles. We may count the number of α-particles emitted during a specified time interval, but obviously we cannot predict precisely the number of particles emitted, even if we know the exact shape, dimension, chemical composition, and mass of the object under consideration. Thus in this case, there seems to be no reasonable deterministic model yielding the number of particles emitted, say n, as a function of various pertinent characteristics of the source material. We must consider, instead, a probabilistic model.





In a deterministic model it is supposed that the actual outcome is determined from the conditions under which the experiment is carried out. In a non-deterministic model, however, the conditions of experimentation determine only the probabilistic behaviour of the observable outcome. Saying it differently, in a deterministic model we use "physical considerations" to predict the outcome, while in a probabilistic model we use the same kind of consideration to specify a probability distribution. In order to check the validity of a model, we must deduce a number of consequences of our model and then compare these predicted results with observations.

4.2. Definitions of Various Terms. Before passing to the classical definition of the notion of probability, we wish to define and explain various terms involved in the definition of probability.

Trial. Assume that one is performing an experiment or observing a phenomenon. The repetition of the experiment or phenomenon under essentially the same conditions does not yield unique results but may result in any one of the several possible outcomes of the experiment or observation. Then the experiment or phenomenon is called a trial, and the possible outcomes of the experiment or observation are termed events or cases or ways. An event is nothing but the occurrence or non-occurrence of a natural phenomenon. We define later that an event is a subset of the sample description space S.

For instance : 

(i) Tossing of a coin is a noteworthy example of a trial, whereas getting a head or a tail is an example of an event.

(ii) Throwing of a die is a trial and getting 1 (or 2 or 3, ... or 6) is an event.

(iii) Drawing three cards from a pack of well-shuffled cards is a trial and obtaining a king, a queen and a knave is an event.

Exhaustive Events. Exhaustive events or cases are defined to be the total number of all possible outcomes in a trial or observation.

For example :

(1) In tossing a coin we have at hand two exhaustive cases, viz. head or tail (provided that we ignore the possibility of the coin standing on an edge).

(2) Throwing of a die furnishes 6 exhaustive events.





(3) Throwing of two dice gives 36 (= $6^2$) exhaustive cases, because any of the 6 numbers from 1 to 6 on the first die can be associated with any of the 6 numbers on the other die.

In general, throwing of n dice produces $6^n$ exhaustive cases.

Favourable Events or Cases. The cases which entail the happening of an event are said to be favourable to the event. For example :

(1) In the tossing of a die, the number of cases favourable to the appearance of a multiple of two is three, viz. 2, 4 and 6.

(2) In throwing of two dice, the cases favourable to obtaining a sum 7 are (1, 6), (6, 1), (2, 5), (5, 2), (3, 4), (4, 3), i.e. 6 in number.
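The counts of exhaustive and favourable cases for two dice can be confirmed by listing every case. The sketch below is illustrative only; the variable names are assumptions.

```python
from itertools import product

# All exhaustive cases for a throw of two dice: ordered pairs (first, second).
exhaustive = list(product(range(1, 7), repeat=2))

# Cases favourable to obtaining a sum of 7.
favourable = [case for case in exhaustive if sum(case) == 7]

# Cases favourable to a multiple of two on a single die.
even_faces = [face for face in range(1, 7) if face % 2 == 0]
```

Listing the cases in this way is practical only for small experiments, but it is a useful check on hand counts.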

Mutually Exclusive Events. Two events, E and F, that cannot occur simultaneously, so that their joint occurrence EF is impossible, are said to be mutually exclusive (or incompatible or disjoint). Hence two events, E and F, are mutually exclusive iff $EF = \phi$, where $\phi$ is an impossible event. In tossing a coin, the events head and tail are mutually exclusive.

Equally Likely Events. All descriptions or outcomes of a trial are said to be equally likely if they have equal probability of occurring. For example :

(1) As the result of drawing a card from a well-shuffled pack, any card may appear in the draw, so that there are 52 equally likely events.

(2) Head and tail are equally likely events in tossing an unbiased or uniform coin.

(3) All the six faces of an unbiased die are equally likely when it is thrown.

Independent Events. Events are said to be independent if the occurrence or non-occurrence of an event is not affected by the supplementary knowledge that concerns the occurrence of any number of the remaining events. In tossing an unbiased coin, the event of finding a tail in the first toss is independent of finding a head in the second and subsequent throws.

Classical or Mathematical (or a Priori) Probability :

Definition. Assume that a trial results in n exhaustive, mutually exclusive and equally likely cases, and $n_A$ of them are favourable to the occurrence of an event A. Then the probability of occurrence of A is defined by the fraction





$P(A) = p = \dfrac{\text{Favourable number of cases}}{\text{Exhaustive number of cases}} = \dfrac{n_A}{n}$.   ...(1)

(1) is also equivalent to the statement that the odds in favour of A are $n_A : (n - n_A)$, or the odds against A are $(n - n_A) : n_A$.

Clearly the number of cases favourable to the non-occurrence of the event A is $(n - n_A)$. The probability of non-occurrence of A is then

$q = \dfrac{n - n_A}{n} = 1 - \dfrac{n_A}{n} = 1 - P(A)$

$= 1 - p$, since $P(A) = p$,

and hence $p + q = 1$.   ...(2)

The probability that an even number will appear when a die 
is tossed is 3/6 or 1/2, for three of the six possible outcomes have 
this attribute. 

To consider another example, suppose that a card is drawn 
at random from an ordinary deck of playing cards. The proba¬ 
bility of drawing a spade is easily seen to be 13/52 or 1/4. The 
probability of drawing a number between 5 and 10, inclusive, is 


24/52, or 6/13. 

We note that, by the classical definition, P(A) is always a number between 0 and 1 inclusive. Clearly $P(A) = n_A/n$ must be a proper fraction, since $n_A \le n$. If an event A is certain to happen, P(A) = 1; if it is certain not to happen, P(A) = 0. The probabilities determined by the classical definition are called a priori probabilities.

Remark. Let A be an event on a sample space S. Then for any A on S,

$P(A) = \dfrac{n_A}{n} = \dfrac{\text{Size of } A}{\text{Size of } S}$.   ...(3)

This formula can be stated in words : If an event is defined as a subset of a finite sample description space, whose descriptions (or outcomes or cases) are all equally likely, then the probability of the event is the ratio of the number of descriptions belonging to it to the total number of descriptions.

This statement may be regarded as a precise formulation of the classical 'equal-likelihood' definition of probability, first explicitly formulated by Laplace in 1812.


Defects of the Classical Approach. There are some rather troublesome defects in the classical, or a priori, approach.

(i) Obviously the definition of probability must be modified when the total number of possible outcomes is infinite.



(ii) What happens when cases are not equally likely ? Suppose that we have a coin known to be biased in favour of heads. The two possible outcomes of tossing the coin are not equally likely. What is the probability of a head ? The classical definition leaves us completely helpless here.

(iii) Another difficulty with the classical approach is encountered when we try to answer questions such as the following : What is the probability that a child born in Delhi will be a boy ? What is the probability that a light bulb will burn less than 100 hours ? In all questions of this type, it is practically impossible to enumerate all the equally likely cases. Thus we shall have to alter or extend our definition to bring problems similar to the above into the framework of the theory. This more widely applicable probability is called a posteriori probability or frequency probability.

Ex. 1. What is the chance that a leap year, selected at random, will contain 53 Sundays ?   [Agra 55]

Sol. A leap year has 366 days and therefore contains 52 complete weeks and 2 days more. These two days may make the following 7 combinations :

1. Monday and Tuesday 2. Tuesday and Wednesday 

3. Wednesday and Thursday 4. Thursday and Friday 

5. Friday and Saturday 6. Saturday and Sunday 

7. Sunday and Monday. 

Of these seven equally likely cases the last two are favourable. 
Hence the required chance = 2/7. 
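The same answer is obtained by counting Sundays over a 366-day year for each possible first weekday. This is a sketch under the same assumption as the solution, namely that the seven starting weekdays are equally likely; the names used are illustrative.

```python
from fractions import Fraction

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def count_sundays(first_day_index, year_length=366):
    """Number of Sundays in a year of `year_length` days starting on DAYS[first_day_index]."""
    return sum(1 for d in range(year_length)
               if DAYS[(first_day_index + d) % 7] == "Sunday")

# First weekdays of the year for which the leap year contains 53 Sundays.
favourable = [DAYS[i] for i in range(7) if count_sundays(i) == 53]
chance = Fraction(len(favourable), len(DAYS))
```

A leap year beginning on a Saturday or a Sunday has its two extra days equal to (Saturday, Sunday) or (Sunday, Monday), which are the two favourable combinations in the list above.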

Ex. 2. A and B stand in a ring with 10 other persons. If the arrangement of the 12 persons is at random, find the chance that there are exactly 3 persons between A and B.

Sol. In the circular permutation 12 persons can be arranged in 11! ways. Fixing the positions of A and B, the three persons between A and B can be chosen and arranged, together with the remaining seven, in

$2 \times \binom{10}{3} \times 3! \times 7!$ ways, i.e. $2 \times 10!$.

Hence the required chance $= \dfrac{2 \times 10!}{11!} = \dfrac{2}{11}$.

Ex. 3. From a pack of 52, three are drawn at random. Find the chance that they are a king, a queen and a knave.   [Agra 69]

Sol. Total number of ways in which 3 cards can be chosen $= \binom{52}{3}$.

Number of ways of choosing a king, a queen and a knave $= 4 \times 4 \times 4$.

∴ required chance $= 4^3 \Big/ \binom{52}{3} = \dfrac{16}{5525}$.

Ex. 4. A card is drawn from an ordinary pack and a gambler bets that it is a spade or an ace. What are the odds against his winning this bet ?   [Agra B.Sc. 60]

Sol. Number of ways in which a card can be drawn = 52.

Number of ways in which the card drawn is a spade or an ace = 13 + 3 = 16.

Hence the probability of the gambler's winning the bet

$= \dfrac{16}{52} = \dfrac{4}{13}$.

Hence the odds against his winning the bet are 9 to 4.

Ex. 5. If n biscuits be distributed at random among N beggars, what is the chance that a particular beggar receives r (< n) biscuits ?

Sol. We take a biscuit and it can be given to any one of the N beggars. Hence n biscuits can be distributed among N beggars in $N \times N \times \ldots$ (n times) $= N^n$ ways.

Suppose a particular beggar receives r biscuits, which can be chosen in $\binom{n}{r}$ ways. Then the remaining n - r biscuits can be distributed among the other N - 1 beggars in $(N-1)^{n-r}$ ways. Hence the required probability is

$\binom{n}{r}\,(N-1)^{n-r} \big/ N^n$.
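The formula can be compared with a direct enumeration of all $N^n$ distributions for small n and N. The sketch below is illustrative; the function names and the chosen values n = 5, N = 3, r = 2 are assumptions.

```python
from fractions import Fraction
from itertools import product
from math import comb

def chance_formula(n, N, r):
    """C(n, r) (N-1)^(n-r) / N^n, as derived in the solution."""
    return Fraction(comb(n, r) * (N - 1) ** (n - r), N ** n)

def chance_by_enumeration(n, N, r):
    """Count the distributions in which beggar 0 receives exactly r biscuits."""
    favourable = 0
    # Each tuple records, biscuit by biscuit, which beggar receives it.
    for assignment in product(range(N), repeat=n):
        if assignment.count(0) == r:
            favourable += 1
    return Fraction(favourable, N ** n)

p_formula = chance_formula(5, 3, 2)
p_counted = chance_by_enumeration(5, 3, 2)
```

The enumeration treats the biscuits as distinguishable, which is the same assumption the solution makes when it counts $N^n$ equally likely ways.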

Ex. 6. There are three events A, B, C, one of which must, and only one can, happen; the odds are 8 to 3 against A, 5 to 2 against B; find the odds against C.

Sol. The chance of A happening is 3/11, the chance of B is 2/7, and the chance of C is 1 - 3/11 - 2/7 = 34/77.

Thus the odds against C are 43 to 34.

Ex. 7. A party of n persons sit at a round table; find the odds against two specified individuals sitting next to each other.   [Punjab 58]

Sol. The number of ways in which n persons can sit at a round table is (n - 1)!, and two particular persons can sit side by side in 2 (n - 2)! ways.

∴ the required chance $= \dfrac{2\,(n-2)!}{(n-1)!} = \dfrac{2}{n-1}$,

and the odds against the event are (n - 3) to 2.

Ex. 8. In a hand at whist what is the chance that the 4 kings are held by a specified player ?

Sol. The number of favourable ways is the same as the number of ways in which the 9 other cards forming the hand can be chosen, which is $\binom{48}{9}$.

And the total number of ways in which the hand can be made is $\binom{52}{13}$.

∴ the required chance $= \binom{48}{9} \Big/ \binom{52}{13} = \dfrac{11}{4165}$.


Ex. 9. There are three works, one consisting of 3 volumes, one of 4, and the other of 1 volume. They are placed on a shelf at random; prove that the chance that volumes of the same works are all together is 3/140.

Sol. The 8 volumes can be placed on the shelf in 8! ways. Volumes of the same works will be all together in $3! \times 3! \times 4!$ ways, for the sets of volumes admit of 3! permutations, and the volumes in two of the sets admit of 3! and 4! permutations respectively.

Thus the required chance $= \dfrac{3! \times 3! \times 4!}{8!} = \dfrac{3}{140}$.


Ex. 10. A and B are two independent witnesses (i.e. there is no collusion between them) in a case. The probability that A will speak the truth is x, and the probability that B will speak the truth is y. A and B agree in a certain statement. Show that the probability that this statement is true is

$\dfrac{xy}{1 - x - y + 2xy}$.

Sol. Either both A and B speak the truth or both speak the falsehood. The respective probabilities for these are xy and (1 - x)(1 - y).

∴ the probability that the statement is true

$= \dfrac{xy}{xy + (1 - x)(1 - y)} = \dfrac{xy}{1 - x - y + 2xy}$.
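A short check with exact fractions shows that the two forms of the answer agree. This is a sketch only; the function name and the values x = 3/4, y = 4/5 are hypothetical.

```python
from fractions import Fraction

def prob_statement_true(x, y):
    """Probability that an agreed statement is true, given that the witnesses agree."""
    both_true = x * y                 # both speak the truth
    both_false = (1 - x) * (1 - y)    # both speak the falsehood
    return both_true / (both_true + both_false)

x, y = Fraction(3, 4), Fraction(4, 5)   # hypothetical reliabilities
p_agree_true = prob_statement_true(x, y)
p_closed_form = (x * y) / (1 - x - y + 2 * x * y)
```

The division by the total probability of agreement is what restricts attention to the cases in which A and B make the same statement.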

Ex. 11. A bag contains 6 white and 9 black balls. Two drawings of 4 balls are made such that (a) the balls are replaced before the second draw, (b) the balls are not replaced before the second draw. Find the probability that the first drawing will give 4 white and the second 4 black balls in each case.

Sol. (a) 4 white balls can be drawn in $\binom{6}{4}$ ways and any 4 balls can be drawn in $\binom{15}{4}$ ways. Since the balls are replaced before the second draw, 4 black balls can then be drawn in $\binom{9}{4}$ ways out of $\binom{15}{4}$.

The required probability in case (a)

$= \dfrac{\binom{6}{4}}{\binom{15}{4}} \times \dfrac{\binom{9}{4}}{\binom{15}{4}} = \dfrac{6}{5915}$.

(b) When 4 white balls have been drawn, the number of remaining balls is 11, out of which 2 are white and 9 are black.

Hence the required probability in case (b)

$= \dfrac{\binom{6}{4}}{\binom{15}{4}} \times \dfrac{\binom{9}{4}}{\binom{11}{4}} = \dfrac{3}{715}$.

Ex. 12. Six cards are drawn from a pack of 52 cards. What is the probability that 3 are red and 3 are black ?   [Agra 53, 56]

Sol. Six cards can be drawn in $\binom{52}{6}$ ways.

Three red and three black cards can be drawn in $\binom{26}{3} \times \binom{26}{3}$ ways.

Hence the required probability $= \binom{26}{3} \times \binom{26}{3} \Big/ \binom{52}{6} = \dfrac{13000}{39151}$.

Ex. 13. A bag contains 4 white, 5 red and 6 black balls. Three are drawn at random. Find the probability that (a) no ball drawn is black, (b) exactly 2 are black, (c) all are of the same colour.   [Delhi Hons. 53]

Sol. (a) When no ball drawn is black, the possibilities are :

(i) 3 white and no red : this admits of $\binom{4}{3}$ ways = 4

(ii) no white and 3 red : this admits of $\binom{5}{3}$ ways = 10

(iii) 1 white and 2 red : this admits of $\binom{4}{1} \times \binom{5}{2}$ ways = 40

(iv) 2 white and 1 red : this admits of $\binom{4}{2} \times \binom{5}{1}$ ways = 30

Total = 84

Three balls can be drawn in $\binom{15}{3}$ ways. Hence the required probability for case (a) $= 84 \Big/ \binom{15}{3} = 12/65$.

(b) When 2 balls drawn are black, the possibilities are :

(i) 2 black and 1 white : this admits of $\binom{6}{2} \times \binom{4}{1}$ ways = 60

(ii) 2 black and 1 red : this admits of $\binom{6}{2} \times \binom{5}{1}$ ways = 75

Total = 135

Hence required probability for case (b) $= 135 \Big/ \binom{15}{3} = 27/91$.

(c) When the balls are of the same colour, the possibilities are :

(i) all the three white : this admits of $\binom{4}{3}$ ways = 4

(ii) all the three red : this admits of $\binom{5}{3}$ ways = 10

(iii) all the three black : this admits of $\binom{6}{3}$ ways = 20

Total = 34

Hence required probability for case (c) $= 34 \Big/ \binom{15}{3} = \dfrac{34}{455}$.


Ex. 14. A symmetrical coin is tossed four times. What is the probability of at least two heads ?

Sol. Since each toss can result in one of two equally likely outcomes, the total number of equally likely possible outcomes of four tosses is seen to be 2 × 2 × 2 × 2 = 16.

There are eleven favourable outcomes, namely,

HHHH, HHHT, HHTH, HTHH, THHH, HHTT,
HTHT, HTTH, THHT, THTH, TTHH.

Then the required probability is 11/16.
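The eleven favourable outcomes can be generated mechanically. The following sketch is illustrative only; the variable names are assumptions.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of four tosses, written as strings such as "HTHH".
outcomes = ["".join(tosses) for tosses in product("HT", repeat=4)]

# Outcomes with at least two heads.
favourable = [o for o in outcomes if o.count("H") >= 2]

probability = Fraction(len(favourable), len(outcomes))
```

Counting the complement instead (no head, or exactly one head) gives 1 + 4 = 5 unfavourable outcomes, which leads to the same result.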

Ex. 15. A set consists of n counters. What is the probability that a selected group of these of unspecified number consists of (1) an even number of counters, (2) an odd number of counters ?

Sol. We have to find the total number of groups that can be formed of 2, 4, 6, ... counters for the case (1) and of 1, 3, 5, ... for the case (2).

The total number of ways of forming groups of 2, 4, 6, ... is respectively $\binom{n}{2}, \binom{n}{4}, \binom{n}{6}, \ldots$, and of forming the groups of 1, 3, 5, ... is $\binom{n}{1}, \binom{n}{3}, \binom{n}{5}, \ldots$

Thus the number of members of the class of even groups is

$\binom{n}{2} + \binom{n}{4} + \ldots = 2^{n-1} - 1$

and the number of members of the class of odd groups is

$\binom{n}{1} + \binom{n}{3} + \binom{n}{5} + \ldots = 2^{n-1}$

while the total number of members of all classes is

$\binom{n}{1} + \binom{n}{2} + \ldots + \binom{n}{n} = 2^n - 1$.

Thus the probability of the selected group being odd is $2^{n-1}/(2^n - 1)$ and of being even is $(2^{n-1} - 1)/(2^n - 1)$.

The former is greater than the latter. The difference between the two probabilities decreases as n increases.

Note. $(1 + 1)^n = \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \ldots + \binom{n}{n}$

$(1 - 1)^n = \binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \ldots + (-1)^n \binom{n}{n}$

whence $2^{n-1} = \binom{n}{1} + \binom{n}{3} + \binom{n}{5} + \ldots$

and $2^{n-1} - 1 = \binom{n}{2} + \binom{n}{4} + \ldots$

Ex. 16. A lot of one hundred items consists of 20 defective and 80 non-defective items. Ten of these items are chosen at random, without replacing any item before the next one is chosen. What is the probability that exactly half of the chosen items are defective ?

Sol. The number of ways of choosing ten items is $\binom{100}{10}$.

Hence the probability of finding exactly 5 defective and 5 non-defective items among the chosen 10 is given by

$\binom{20}{5}\binom{80}{5} \Big/ \binom{100}{10}$.

Generalization of the problem. Suppose the N items are made up of m A's and n B's (with m + n = N) and r of these are chosen at random, without replacement. What is the probability that the r chosen items contain exactly x A's and (r - x) B's ?

Sol. If we choose r of the items at random, without replacement, there are $\binom{N}{r}$ different possible samples, all of which have the same probability of being chosen.

The probability that the r chosen items contain exactly x A's and (r - x) B's is given by

$P(X = x) = \binom{m}{x}\binom{n}{r - x} \Big/ \binom{N}{r}$, x = 0, 1, 2, ...

This is called hypergeometric probability.
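The hypergeometric probability is easy to evaluate with exact arithmetic. The sketch below applies it to the lot of Ex. 16; the function name and argument order are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def hypergeometric(x, m, n, r):
    """P(X = x): exactly x A's among r items drawn without replacement
    from m A's and n B's (so N = m + n)."""
    return Fraction(comb(m, x) * comb(n, r - x), comb(m + n, r))

# Ex. 16: 20 defective, 80 non-defective, 10 chosen, exactly 5 defective.
p_half_defective = hypergeometric(5, 20, 80, 10)

# The probabilities over all possible values of x add up to one.
total = sum(hypergeometric(x, 20, 80, 10) for x in range(0, 11))
```

Summing over all admissible x is a convenient check that no case has been omitted from the count of $\binom{N}{r}$ samples.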




4.3. A Posteriori or Frequency Probability. One of the basic characteristics of the concept of 'experiment' is that we do not know which particular outcome will occur when the experiment is performed. Saying it differently, if A is an event associated with the experiment, then we cannot state with certainty that A will or will not occur. Hence it becomes very important to try to associate a number with the event A which will measure, in some sense, how likely it is that the event A occurs. This task leads us to the theory of probability.

In order to motivate the approach adopted for the solution of the above problem, consider the following procedure. Suppose that we repeat the experiment E n times and let A and B be two events associated with E. We let $n_A$ and $n_B$ be the number of times that the event A and the event B occurred among the n repetitions, respectively.

Definition. $f_A = n_A/n$ is called the relative frequency of the event A in the n repetitions of E. The relative frequency $f_A$ has the following important properties, which are easily verified.

(1) $0 \le f_A \le 1$.

(2) $f_A = 1$ if and only if A occurs every time among the n repetitions.

(3) $f_A = 0$ if and only if A never occurs among the n repetitions.

(4) If A and B are two mutually exclusive events [i.e. $A \cap B = \phi$ (null set)] and if $f_{A \cup B}$ is the relative frequency associated with the event $A \cup B$, then $f_{A \cup B} = f_A + f_B$.

There is another important property of relative frequency.

Let $n'_r$ be the number of times the event A occurs among the $n_r$ repetitions of E. Hence the relative frequency of A based on $n_r$ repetitions of E is given by $f_A = n'_r/n_r$. The property referred to above states that as r becomes large (that is, the number of repetitions of E is increased), the relative frequency based on this increasing number of repetitions tends to 'stabilize' near some definite numerical value. The essence of this property is that if an experiment is performed a large number of times, the relative frequency of occurrence of some event A tends to vary less and less as the number of repetitions is increased. This characteristic is also referred to as statistical regularity.
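Statistical regularity can be observed in a simulated experiment. The sketch below tosses a simulated fair coin and records the relative frequency of heads at a few checkpoints; the function name, the seed and the checkpoints are illustrative assumptions, and a pseudo-random generator stands in for a real coin.

```python
import random

def relative_frequencies(num_trials, checkpoints, seed=0):
    """Relative frequency of heads after each checkpoint number of tosses."""
    rng = random.Random(seed)        # fixed seed so that the run is repeatable
    heads = 0
    result = {}
    for n in range(1, num_trials + 1):
        if rng.random() < 0.5:       # event A: the toss shows a head
            heads += 1
        if n in checkpoints:
            result[n] = heads / n    # f_A = n_A / n
    return result

freqs = relative_frequencies(100_000, {10, 100, 1_000, 10_000, 100_000})
for n in sorted(freqs):
    print(n, freqs[n])
```

For small n the printed values may differ noticeably from one half, while for the largest checkpoints they settle close to it, which is the behaviour described above.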





4.4. Basic Notions of Probability. The problem before us is to assign a number to each event A which will measure how likely it is that A occurs when the experiment is performed. One possible approach might be the following one : Repeat the experiment a large number of times, compute the relative frequency $f_A$, and use this number. We know that as the experiment is repeated more and more times, the relative frequency $f_A$ stabilizes near some number, say p. What we want is a means of obtaining such a number p without resorting to experimentation. We proceed formally as follows :

Definition. Let E be an experiment. Let S be a sample space associated with E. With each event A we associate a real number, designated by P(A) and called the probability of A, satisfying the following axioms :

(1) $0 \le P(A) \le 1$

(2) P(S) = 1

(3) If A and B are mutually exclusive events (that is, they cannot occur together), $P(A \cup B) = P(A) + P(B)$

(4) If $A_1, A_2, \ldots, A_n, \ldots$ are pairwise mutually exclusive events, then

$P\left(\bigcup_{i=1}^{\infty} A_i\right) = P(A_1) + P(A_2) + \ldots + P(A_n) + \ldots$

For any finite n we have from property 3

$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$.

The choice of the above-listed axioms of probability is obviously motivated by the corresponding characteristics of relative frequency. The property of statistical regularity will be tied in with this definition of probability later. We shall show that the numbers P(A) and $f_A$ are "close" to each other (in a certain sense), if $f_A$ is based on a large number of repetitions. It is this fact which gives us the justification to use P(A) for measuring how probable it is that A occurs.

Sample Space. With each experiment E we define the sample space as the set of all possible outcomes of E. We usually designate this set by S. Each outcome of the experiment is a sample point of the sample space.


If a random experiment consists of throwing a coin two times, there are four conceivable outcomes :

(H, H), (H, T), (T, H), (T, T). Thus there are four sample points which make up the sample space.

Let us consider each of the experiments below and describe a sample space for each. The sample space $S_i$ will refer to the experiment $E_i$.

$E_1$ : Toss a die and observe the number that shows on top.

$S_1$ : {1, 2, 3, 4, 5, 6}

$E_2$ : Toss a coin four times and observe the total number of heads obtained.

$S_2$ : {0, 1, 2, 3, 4}

$E_3$ : Successive tosses of a coin until a head turns up.

$S_3$ : {1, 2, 3, ...}

$E_4$ : n tosses of a coin with head or tail as outcome in each toss.

$S_4$ : {all possible sequences of the form $a_1, a_2, a_3, \ldots, a_n$, where each $a_i$ = H or T depending on whether heads or tails appeared on the i-th toss}

$E_5$ : A light bulb is manufactured. It is then tested for its life length by inserting it into a socket, and the time elapsed (in hours) until it burns out is recorded.

$S_5$ : {t | t ≥ 0}

$E_6$ : A lot of 10 items contains 3 defectives. One item is chosen after another (without replacing the chosen item) until the last defective item is obtained. The total number of items removed from the lot is counted.

$S_6$ : {3, 4, 5, 6, 7, 8, 9, 10}

$E_7$ : Items are manufactured until 10 non-defective items are produced. The total number of manufactured items is counted.

$S_7$ : {10, 11, 12, ...}

A sample space may be finite or infinite, according as it contains a finite or an infinite number of points. A sample space containing at most a denumerable number of elements is termed discrete. Sample spaces containing a non-denumerable number of elements include the so-called "continuous sample space". When the sample space is finite or countably infinite, every subset may be considered as an event. (It is an easy exercise to show that if S has n members, there are exactly $2^n$ subsets (events).)





An event $A$ in the sample space $S$ is defined to be a subset $A$
of points in $S$, and when we say "the probability that event $A$
occurs" we shall mean the probability that any point of $A$ occurs.

Let us state and prove several consequences concerning $P(A)$
which follow from the above conditions.

Theorem I. If $\phi$ is the empty set, then $P(\phi) = 0$.

Proof. We may write, for any event $A$, $A = A \cup \phi$. Since $A$
and $\phi$ are mutually exclusive, axiom (3) yields

$P(A) = P(A \cup \phi) = P(A) + P(\phi)$

$\Rightarrow P(\phi) = 0$.

Remark. If $P(A) = 0$, we cannot in general conclude that
$A = \phi$, for there are situations in which we assign probability zero
to an event that can occur.

Theorem II. If $\bar{A}$ is the complementary event of $A$, then

$P(A) = 1 - P(\bar{A})$

Proof. We may write $S = A \cup \bar{A}$ and, using axioms 2 and 3,
we have

$P(S) = P(A \cup \bar{A}) = P(A) + P(\bar{A})$

$\Rightarrow 1 = P(A) + P(\bar{A}) \Rightarrow P(A) = 1 - P(\bar{A})$

Note. This is a particularly useful result, for it means that
whenever we wish to evaluate $P(A)$ we may instead compute $P(\bar{A})$
and then obtain the desired result by subtraction. As we shall see
later, in many problems it is much easier to compute $P(\bar{A})$ than
$P(A)$.

Theorem III. If $A$ and $B$ are any two events, then

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Proof. Here we decompose $A \cup B$
and $B$ into mutually exclusive events
and then apply Property 3.

(See the adjacent Venn diagram)

Thus we write

$A \cup B = A \cup (B \cap \bar{A})$

$B = (A \cap B) \cup (B \cap \bar{A})$







Hence $P(A \cup B) = P(A) + P(B \cap \bar{A})$

$P(B) = P(A \cap B) + P(B \cap \bar{A})$

Subtraction yields

$P(A \cup B) - P(B) = P(A) - P(A \cap B)$
$\Rightarrow P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Note. If $A \cap B = \phi$, i.e. if the sets $A$ and $B$ are disjoint, then
$P(A \cup B) = P(A) + P(B)$, which is exactly axiom 3.

This result is sometimes written as

$P(A + B) = P(A) + P(B) - P(AB)$

and denotes the probability of the occurrence of at least one of
the events $A$, $B$.

Theorem IV. If $A$, $B$ and $C$ are any three events, then

$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$

or $P(A + B + C) = P(A) + P(B) + P(C) - P(AB) - P(AC) - P(BC) + P(ABC)$.

Proof. By the associative property of sets we have

$A \cup B \cup C = (A \cup B) \cup C$

By Theorem III we obtain

$P(A \cup B \cup C) = P\{(A \cup B) \cup C\}$

$= P(A \cup B) + P(C) - P\{(A \cup B) \cap C\}$

$= P(A) + P(B) - P(A \cap B) + P(C) - P\{(A \cap C) \cup (B \cap C)\}$

since $(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$, by the distributive
property of sets.

Now $P\{(A \cap C) \cup (B \cap C)\}$

$= P(A \cap C) + P(B \cap C) - P\{(A \cap C) \cap (B \cap C)\}$
$= P(A \cap C) + P(B \cap C) - P(A \cap B \cap C)$

Hence

$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$

Note 1. An obvious extension to the above theorem suggests
itself.

Let $A_1, A_2, \ldots, A_k$ be any $k$ events. Then

$P(A_1 + A_2 + \cdots + A_k) = \sum_{i=1}^{k} P(A_i) - \sum_{i<j} P(A_i A_j) + \sum_{i<j<r} P(A_i A_j A_r) - \cdots + (-1)^{k-1} P(A_1 A_2 \cdots A_k)$,





where the second sum is over all combinations of the numbers
$1, 2, \ldots, k$ taken two at a time, the third is over all combinations
of the numbers taken three at a time, and so forth.

If all the events are mutually exclusive, then all the probabilities
in the sums beyond the first sum are 0.

Note 2. $P(A \cup B) \le P(A) + P(B)$,

since $P(A \cap B) \ge 0$.

This is known as Boole's inequality.

Generally,

$P(A_1 + A_2 + \cdots + A_k) \le P(A_1) + P(A_2) + \cdots + P(A_k)$.

The equality sign holds when the events $A_i$ and $A_j$ are mutually
exclusive for all $i \ne j$.

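Note 3. If $P(A_i)$ be written as $p_i$, $P(A_i A_j)$ as $p_{ij}$,
$P(A_i A_j A_r)$ as $p_{ijr}$, and so on, then

$P(A_1 + A_2 + \cdots + A_k) = \sum p_i - \sum p_{ij} + \sum p_{ijr} - \cdots + (-1)^{k-1} p_{12 \ldots k}$

$= S_1 - S_2 + S_3 - \cdots + (-1)^{k-1} S_k$

where $S_1, S_2, \ldots$ denote the sums of the probabilities of the events taken
one, two, ..., at a time.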
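
The formula of Note 3 can be checked numerically. The following is a minimal sketch in Python, assuming a finite equiprobable sample space; the helper names `prob` and `union_by_inclusion_exclusion`, and the choice of events (multiples of 2, 3 and 5 among the integers 1 to 30), are illustrative assumptions and not part of the text.

```python
from fractions import Fraction
from itertools import combinations

def prob(event, space):
    """P(event) on a finite sample space with equally likely points."""
    return Fraction(len(event & space), len(space))

def union_by_inclusion_exclusion(events, space):
    """Evaluate S_1 - S_2 + S_3 - ... for the events A_1, ..., A_k."""
    total = Fraction(0)
    for size in range(1, len(events) + 1):
        # S_size: sum of the probabilities of all intersections of `size` events
        s = sum(prob(set.intersection(*combo), space)
                for combo in combinations(events, size))
        total += (-1) ** (size - 1) * s
    return total

# Hypothetical illustration: integers 1..30, events "multiple of 2, 3, 5".
space = set(range(1, 31))
events = [{x for x in space if x % d == 0} for d in (2, 3, 5)]

direct = prob(set.union(*events), space)
by_formula = union_by_inclusion_exclusion(events, space)
print(direct, by_formula)
```

Both values agree, as the theorem requires; the same function works for any number of events defined on the same finite space.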

Illustrative Examples. 

Ex. 1. One card is drawn from a regular deck of 52 cards.
What is the probability of the card being either red or a king ?

Sol. Let $A$ = {the card is red}; $B$ = {the card is a king}.

The event of interest is $A + B$, where $A$ and $B$ are not exclusive
events.

$P(A) = \frac{26}{52} = \frac{1}{2}$, $P(B) = \frac{4}{52} = \frac{1}{13}$, $P(AB) = \frac{2}{52} = \frac{1}{26}$

Hence $P(A \text{ or } B) = P(A + B)$

$= \frac{1}{2} + \frac{1}{13} - \frac{1}{26} = \frac{7}{13}$.
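
As a check on Ex. 1, the deck can be enumerated directly. This is a small sketch in Python; the rank and suit labels and the helper `P` are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))          # 52 equally likely cards

A = {c for c in deck if c[1] in ("hearts", "diamonds")}   # the card is red
B = {c for c in deck if c[0] == "K"}                      # the card is a king

def P(event):
    return Fraction(len(event), len(deck))

p_union = P(A | B)
print(p_union, P(A) + P(B) - P(A & B))
```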


Ex. 2. A drawer contains 50 bolts and 150 nuts. Half of the
bolts and half of the nuts are rusted. If one item is chosen at
random, what is the probability that it is rusted or is a bolt ?

Sol. Let $A$ = {the item is rusted}, $B$ = {the item is a bolt}.

$P(A) = \frac{100}{200}$, $P(B) = \frac{50}{200}$, $P(AB) = \frac{25}{200}$

Hence, $P(A \text{ or } B) = P(A) + P(B) - P(AB)$

$= \frac{100}{200} + \frac{50}{200} - \frac{25}{200} = \frac{125}{200} = \frac{5}{8}$.




Ex. 3. An urn contains 11 balls numbered from 1 to 11.
If a ball is selected at random, what is the probability of having a
ball with a number which is a multiple of either 2 or 3 ?

Sol. Let $A$ = {the ball number is a multiple of 2}
$B$ = {the ball number is a multiple of 3}

$P(A) = 5/11$ since $A = \{2, 4, 6, 8, 10\}$

$P(B) = 3/11$ since $B = \{3, 6, 9\}$

$P(AB) = 1/11$ since $AB = \{6\}$

Hence $P(A \cup B) = \frac{5}{11} + \frac{3}{11} - \frac{1}{11} = 7/11$.


Ex. 4. The problem of 'rencontres'.

The first $n$ integers are written in random order, i.e. all
permutations are equally likely. A match occurs in the $r$th position
if the integer $r$ is found to occupy that position. What is the
probability that there is at least one such match ?

Sol. Let $A_i$ denote the event that a match occurs in the $i$th
position. Then

$P(A_i) = \frac{(n-1)!}{n!}$ for $i = 1, 2, \ldots, n$, there being ${}^{n}C_{1}$ such terms

$P(A_i A_j) = \frac{(n-2)!}{n!}$ for $i < j$, there being ${}^{n}C_{2}$ such terms

$P(A_i A_j A_r) = \frac{(n-3)!}{n!}$ for $i < j < r$, there being ${}^{n}C_{3}$ such terms

......

$P(A_1 A_2 \cdots A_n) = \frac{1}{n!}$, there being ${}^{n}C_{n}$ such terms

Required probability $= P(A_1 + A_2 + \cdots + A_n)$

$= {}^{n}C_{1} \frac{(n-1)!}{n!} - {}^{n}C_{2} \frac{(n-2)!}{n!} + {}^{n}C_{3} \frac{(n-3)!}{n!} - \cdots + (-1)^{n-1} \, {}^{n}C_{n} \frac{1}{n!}$

$= 1 - \frac{1}{2!} + \frac{1}{3!} - \cdots + (-1)^{n-1} \frac{1}{n!}$

As $n \to \infty$ this probability $\to 1 - e^{-1} = 0.632$.
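
The closed form above can be compared with a direct count over all permutations for small $n$. The sketch below is in Python; the function names are illustrative assumptions, and positions are numbered from 0 rather than 1 purely for convenience.

```python
from fractions import Fraction
from itertools import permutations
from math import exp, factorial

def match_probability(n):
    """1 - 1/2! + 1/3! - ... + (-1)^(n-1)/n!  (the series derived above)."""
    return sum(Fraction((-1) ** (k - 1), factorial(k)) for k in range(1, n + 1))

def match_probability_brute(n):
    """Fraction of the n! permutations having at least one fixed position."""
    perms = list(permutations(range(n)))
    hits = sum(any(p[i] == i for i in range(n)) for p in perms)
    return Fraction(hits, len(perms))

# The two computations agree for every small n tried here.
for n in range(1, 8):
    assert match_probability(n) == match_probability_brute(n)

# Already for n = 10 the value is very close to the limit 1 - 1/e.
print(float(match_probability(10)), 1 - exp(-1))
```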

Ex. 5. Six cards are drawn from an ordinary deck, with
replacement. What is the probability that each of the four suits will
be represented at least once among the six cards ?

Sol. Let $A$ = {the appearance of all the suits}

$B$ = {non-appearance of at least one of the suits}






Then clearly $B$ = the complement of the event $A$, i.e. $\bar{A}$.

Since either $A$ or $B$ is certain to happen,

$P(A + B) = 1$

and since $A$ and $B$ are mutually exclusive,

$P(A + B) = P(A) + P(B) = 1$
and $P(A) = 1 - P(B)$

Thus, if we can find $P(B)$, $P(A)$ can be determined at once.

To get $P(B)$, we classify the possible outcomes favourable to $B$
into four sets :

$B_1$ = the set of all outcomes in which spades are absent.

$B_2$ = the set of all outcomes in which hearts are absent.

$B_3$ = the set of all outcomes in which diamonds are absent.

$B_4$ = the set of all outcomes in which clubs are absent.

These sets are overlapping; an outcome which consists of only
spades and hearts falls in $B_3$ and in $B_4$. Clearly

$P(B) = P(B_1 \cup B_2 \cup B_3 \cup B_4)$

and $P(B) = \sum P(B_i) - \sum P(B_i B_j) + \sum P(B_i B_j B_r) - P(B_1 B_2 B_3 B_4)$
in which the sums are taken over all combinations of the subscripts.

$P(B_1)$ = the probability that a spade will not appear in the six
draws
$= (3/4)^6$, and the value is the same for all $B_i$;
hence $\sum P(B_i) = {}^{4}C_{1} (3/4)^6$.

$P(B_1 B_2)$ = the probability that neither spades nor hearts will
appear in the six draws $= (1/2)^6$, and is the same for all ${}^{4}C_{2} = 6$ pairs;
hence $\sum P(B_i B_j) = {}^{4}C_{2} (1/2)^6$.

Similarly $\sum P(B_i B_j B_r) = {}^{4}C_{3} (1/4)^6$
and $P(B_1 B_2 B_3 B_4) = 0$

since the simultaneous non-appearance of every suit is impossible.
The required probability is, therefore,

$P(A) = 1 - {}^{4}C_{1} (3/4)^6 + {}^{4}C_{2} (1/2)^6 - {}^{4}C_{3} (1/4)^6$

$= 1 - 4(3/4)^6 + 6(1/2)^6 - 4(1/4)^6$
$\approx 0.381$
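
The value 0.381 can be confirmed both from the inclusion-exclusion sum and by listing all suit sequences. The following Python sketch assumes, as the solution does, that only the suit of each card matters and that each of the four suits is equally likely on every draw; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

DRAWS = 6

# Inclusion-exclusion: P(A) = 1 - sum_j (-1)^(j-1) * C(4, j) * ((4 - j)/4)^6
p_formula = 1 - sum(
    (-1) ** (j - 1) * comb(4, j) * Fraction(4 - j, 4) ** DRAWS
    for j in range(1, 5)
)

# Brute force over the 4^6 = 4096 equally likely suit sequences.
outcomes = list(product(range(4), repeat=DRAWS))
p_brute = Fraction(sum(len(set(o)) == 4 for o in outcomes), len(outcomes))

print(p_formula, float(p_formula))
```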


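Ex. 6. What is the probability that at least one of the players
in a bridge game will get a complete suit of cards ? [Delhi 65]

Sol. Let $A_1$ = {the event that the first player gets a complete
suit of cards}
$A_2$ = {the event that the second player gets a complete suit of cards}
$A_3$ = {the event that the third player gets a complete suit of cards}
$A_4$ = {the event that the fourth player gets a complete suit of cards}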







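$\therefore P(A_1) = \frac{4}{{}^{52}C_{13}}$, $P(A_1 A_2) = \frac{4 \times 3}{{}^{52}C_{13} \cdot {}^{39}C_{13}}$,

$P(A_1 A_2 A_3) = \frac{4 \times 3 \times 2}{{}^{52}C_{13} \cdot {}^{39}C_{13} \cdot {}^{26}C_{13}}$,

$P(A_1 A_2 A_3 A_4) = \frac{4 \times 3 \times 2 \times 1}{{}^{52}C_{13} \cdot {}^{39}C_{13} \cdot {}^{26}C_{13} \cdot {}^{13}C_{13}}$

Hence

$P(A_1 + A_2 + A_3 + A_4) = {}^{4}C_{1} \frac{4}{{}^{52}C_{13}} - {}^{4}C_{2} \frac{4 \times 3}{{}^{52}C_{13} \cdot {}^{39}C_{13}} + {}^{4}C_{3} \frac{4 \times 3 \times 2}{{}^{52}C_{13} \cdot {}^{39}C_{13} \cdot {}^{26}C_{13}} - {}^{4}C_{4} \frac{4 \times 3 \times 2 \times 1}{{}^{52}C_{13} \cdot {}^{39}C_{13} \cdot {}^{26}C_{13} \cdot {}^{13}C_{13}}$

since an event of the type $A_1$ can take place in ${}^{4}C_{1}$ ways,
one of the type $A_1 A_2$ in ${}^{4}C_{2}$ ways, one of the type $A_1 A_2 A_3$ in ${}^{4}C_{3}$ ways
and the event $A_1 A_2 A_3 A_4$ in ${}^{4}C_{4}$ ways.

Required probability

$= \frac{16 \cdot 13! \cdot 39! - 72 \cdot (13!)^2 \cdot 26! + 72 \cdot (13!)^4}{52!}$

$\approx 2.52 \times 10^{-11}$.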


Ex. 7. In a random distribution of $r$ balls in $n$ cells, find the
probability that exactly $m$ cells remain empty. Hence or otherwise
find the probability that in $n$ random throws of an unbiased
six-faced die all faces have come up at least once. [Bombay 63]

Sol. Let $A_k$ denote the event that cell number $k$ is empty
$(k = 1, 2, \ldots, n)$. In this event all $r$ balls are placed in the remaining
$(n-1)$ cells, and this can be done in $(n-1)^r$ different ways.
Similarly when two cells are left empty, $r$ balls can be put in the
remaining $(n-2)$ cells in $(n-2)^r$ different ways and so on. Now,
since $r$ balls can be randomly distributed over $n$ cells in $n^r$ different
ways,

$p_i = \frac{(n-1)^r}{n^r}$, $p_{ij} = \frac{(n-2)^r}{n^r}$, $p_{ijk} = \frac{(n-3)^r}{n^r}$, ...

One empty cell can be chosen in ${}^{n}C_{1}$ ways, two empty cells
in ${}^{n}C_{2}$ ways, three empty cells in ${}^{n}C_{3}$ ways and so on. Hence

$S_1 = \sum p_i = {}^{n}C_{1} (1 - 1/n)^r$
$S_2 = \sum p_{ij} = {}^{n}C_{2} (1 - 2/n)^r$
$S_3 = \sum p_{ijk} = {}^{n}C_{3} (1 - 3/n)^r$

and so on.

The probability that at least one cell is empty is given by
$P(A_1 \cup A_2 \cup \cdots \cup A_n) = S_1 - S_2 + S_3 - \cdots + (-1)^{n+1} S_n$




Hence the probability that all cells are occupied is given by

$P(\cap_k \bar{A}_k) + P(\cup_k A_k) = 1$

or $P(\cap_k \bar{A}_k) = 1 - P(A_1 \cup A_2 \cup \cdots \cup A_n)$ [by De Morgan's Law]

$= 1 - S_1 + S_2 - S_3 + \cdots - (-1)^{n+1} S_n$

or $P(\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n) = 1 - {}^{n}C_{1} (1 - 1/n)^r + {}^{n}C_{2} (1 - 2/n)^r - {}^{n}C_{3} (1 - 3/n)^r + \cdots$

or $p_0(r, n) = \sum_{v=0}^{n} (-1)^v \, {}^{n}C_{v} (1 - v/n)^r$ ...(1)

Consider now a distribution in which exactly $m$ cells are
empty; these can be chosen in ${}^{n}C_{m}$ ways. The $r$ balls are
distributed among the remaining $n - m$ cells so that each of these cells is
occupied; the number of such distributions is $(n-m)^r \, p_0(r, n-m)$.
Dividing by $n^r$ we find that the probability of exactly $m$ cells
remaining empty is

$p_m(r, n) = {}^{n}C_{m} \frac{(n-m)^r}{n^r} \, p_0(r, n-m)$

$= {}^{n}C_{m} (1 - m/n)^r \sum_{v=0}^{n-m} (-1)^v \, {}^{n-m}C_{v} \left(1 - \frac{v}{n-m}\right)^r$

$= {}^{n}C_{m} \sum_{v=0}^{n-m} (-1)^v \, {}^{n-m}C_{v} \left(1 - \frac{m+v}{n}\right)^r$

For the second part we argue as below.

Let $A_1, A_2, A_3, A_4, A_5, A_6$ denote the events that the points
1, 2, 3, 4, 5, 6 do not appear respectively.

We have to determine

$P(\bar{A}_1 \cap \bar{A}_2 \cap \bar{A}_3 \cap \cdots \cap \bar{A}_6)$.

Either all the points 1, 2, 3, 4, 5, 6 appear or at least one of
these does not appear in $n$ throws and hence, these being
complementary events, we have

$P[\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_6] + P[A_1 \cup A_2 \cup \cdots \cup A_6] = 1$

i.e. $P[\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_6] = 1 - P[A_1 \cup A_2 \cup \cdots \cup A_6]$





$= 1 - [\sum P(A_i) - \sum P(A_i A_j) + \sum P(A_i A_j A_k) - \cdots]$
$= 1 - [\sum p_i - \sum p_{ij} + \sum p_{ijk} - \cdots]$

Now suppose 1 does not appear in $n$ throws; its probability
is $(5/6)^n$. Any point from among 1, 2, 3, 4, 5, 6 can be chosen in
${}^{6}C_{1}$ ways and hence $\sum p_i = {}^{6}C_{1} (5/6)^n$.

Next suppose that 1 and 2 do not appear in $n$ throws, the
probability of which is $(4/6)^n$. Any two points from among 1, 2, 3, 4, 5, 6
can be chosen in ${}^{6}C_{2}$ ways and hence $\sum p_{ij} = {}^{6}C_{2} (4/6)^n$.

Similarly, $\sum p_{ijk} = {}^{6}C_{3} (3/6)^n$

$\sum p_{ijkl} = {}^{6}C_{4} (2/6)^n$

$\sum p_{ijklm} = {}^{6}C_{5} (1/6)^n$

and $P(A_1 A_2 A_3 A_4 A_5 A_6) = 0$, since it is impossible that none of
the points 1, 2, 3, 4, 5, 6 appear in $n$ throws.

Hence the probability that all faces have come up at least
once in $n$ throws is

$1 - {}^{6}C_{1} (5/6)^n + {}^{6}C_{2} (4/6)^n - {}^{6}C_{3} (3/6)^n + {}^{6}C_{4} (2/6)^n - {}^{6}C_{5} (1/6)^n$.
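
Both results of Ex. 7 lend themselves to a numerical check. The sketch below, in Python, evaluates the final expression for $p_m(r, n)$ and compares it with a brute-force count over all $n^r$ placements; the function names `p_empty` and `p_empty_brute` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import comb

def p_empty(m, r, n):
    """Probability that exactly m of n cells stay empty when r balls
    are placed at random (last form of p_m(r, n) derived above)."""
    return comb(n, m) * sum(
        (-1) ** v * comb(n - m, v) * Fraction(n - m - v, n) ** r
        for v in range(n - m + 1)
    )

def p_empty_brute(m, r, n):
    """Same probability by listing all n^r equally likely placements."""
    placements = list(product(range(n), repeat=r))
    hits = sum(n - len(set(p)) == m for p in placements)
    return Fraction(hits, len(placements))

# Check the formula on a small case: 4 balls in 3 cells.
for m in range(3):
    assert p_empty(m, 4, 3) == p_empty_brute(m, 4, 3)

# Second part: all six faces appear in n throws of a die, i.e. m = 0, cells = 6.
all_faces_in_six_throws = p_empty(0, 6, 6)
print(all_faces_in_six_throws)
```

With six throws every face must appear exactly once, so the value is $6!/6^6$, which gives a simple independent check on the formula.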

Ex. 8. An urn contains $n$ tickets bearing numbers from 1 to
$n$, and $m$ tickets are drawn at a time and returned before the next
drawing is made. What is the probability that in $k$ drawings each
of the numbers $1, 2, \ldots, n$ will appear at least once ? [Bombay 60]

Sol. Either all the numbers $1, 2, 3, \ldots, n$ occur in $k$ drawings
or at least one of these numbers does not occur.

Let $A_1, A_2, \ldots, A_n$ denote the events of the non-occurrence of
$1, 2, \ldots, n$ respectively.

Hence

$P(\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n) = 1 - P(A_1 \cup A_2 \cup \cdots \cup A_n)$

$= 1 - [\sum p_i - \sum p_{ij} + \sum p_{ijk} - \cdots]$

$= 1 - [S_1 - S_2 + S_3 - \cdots + (-1)^{n+1} S_n]$

where $S_1, S_2, \ldots$ denote the sums of the probabilities of the events
taken one, two, ..., at a time.





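Now the probability that the ticket bearing number $i$ does
not appear in one drawing

$= {}^{n-1}C_{m} / {}^{n}C_{m} = \frac{n-m}{n}$

and hence $p_i = \left(\frac{n-m}{n}\right)^k$,
since tickets are returned before the next drawing is made.
Similarly,

$p_{ij} = \left( {}^{n-2}C_{m} / {}^{n}C_{m} \right)^k = \left(\frac{n-m}{n}\right)^k \left(\frac{n-m-1}{n-1}\right)^k$

and so on.

$\therefore$ required probability

$= 1 - {}^{n}C_{1} \left(\frac{n-m}{n}\right)^k + \frac{n(n-1)}{1 \cdot 2} \left(\frac{n-m}{n}\right)^k \left(\frac{n-m-1}{n-1}\right)^k - \cdots$

Theorem V. If $A \subset B$, then $P(A) \le P(B)$.

Proof. We may decompose $B$ into two
mutually exclusive events as follows :

$B = A \cup (B \cap \bar{A})$

Hence $P(B) = P(A) + P(B \cap \bar{A}) \ge P(A)$,

since $P(B \cap \bar{A}) \ge 0$, probabilities being non-negative.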

4.5. Finite sample space.

Here we shall deal with experiments for which the sample
space $S$ consists of a finite number of elements. That is, we suppose
that $S$ may be written as $S = \{a_1, a_2, \ldots, a_k\}$. The event $A = \{a_i\}$
consisting of a single outcome is called an elementary event.

To each elementary event $\{a_i\}$ we assign a number $p_i$, called
the probability of $\{a_i\}$, satisfying the following conditions :

(1) $p_i \ge 0$, $i = 1, 2, \ldots, k$

(2) $p_1 + p_2 + \cdots + p_k = 1$

Suppose that an event $A$ consists of $r$ outcomes, $1 \le r \le k$, say

$A = \{a_{j_1}, a_{j_2}, \ldots, a_{j_r}\}$

where $j_1, j_2, \ldots, j_r$ represent any $r$ indices from $1, 2, \ldots, k$. Hence it
follows from Eq. (4) of §4.4 that

$P(A) = p_{j_1} + p_{j_2} + \cdots + p_{j_r}$ ...(1)





That is, the probability of an event $A$ consisting of $r$ outcomes
equals the sum of the probabilities of the various individual
outcomes making up the event $A$.

In order to evaluate the individual $p_i$'s some assumption
concerning the individual outcomes must be made. For example,
suppose three outcomes $a_1$, $a_2$ and $a_3$ are possible in an experiment
and further that $a_1$ is twice as probable to occur as $a_2$, which in
turn is twice as probable to occur as $a_3$. Hence $p_1 = 2p_2$ and $p_2 = 2p_3$.
Since $p_1 + p_2 + p_3 = 1$, we have $4p_3 + 2p_3 + p_3 = 1$,
which $\Rightarrow p_3 = 1/7$, $p_2 = 2/7$, $p_1 = 4/7$.

Ex. Three men toss a coin in succession for a prize to be
given to the one who first obtains head. Show that their chances of
winning are 4/7, 2/7, 1/7 respectively.

Sol. Let the three men be $A$, $B$ and $C$. $B$'s chance in any round
is $\frac{1}{2}$ of $A$'s chance, and $C$'s is $\frac{1}{2}$ of $B$'s chance. If $x$ = $A$'s chance in
the long run, we have

$x + \frac{x}{2} + \frac{x}{4} = 1$

Thus $x = \frac{4}{7}$, and the respective chances are

$\frac{4}{7}, \frac{2}{7}, \frac{1}{7}$.
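
The same chances can be obtained by summing a geometric series over the rounds. The short Python sketch below assumes a fair coin and the tossing order $A$, $B$, $C$, $A$, $B$, $C$, ...; the function name `chance` is illustrative.

```python
from fractions import Fraction

half = Fraction(1, 2)

def chance(first_turn):
    """Chance of winning for the man whose first toss is toss number
    `first_turn` (1, 2 or 3). The first head falls on toss t with
    probability (1/2)^t, and he tosses on first_turn, first_turn + 3, ...
    so the series is (1/2)^first_turn * (1 + 1/8 + 1/64 + ...)."""
    return half ** first_turn / (1 - half ** 3)

chances = [chance(t) for t in (1, 2, 3)]
print(chances)
```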



4.51. Equally likely outcomes

The most commonly made assumption for finite sample spaces
is that all outcomes are equally likely.

If all the $k$ outcomes are equally likely, it follows that each
$p_i = 1/k$, for only then is the condition $p_1 + \cdots + p_k = 1$ satisfied.

Hence for any event $A$ consisting of $r$ outcomes, we have $P(A) = r/k$, i.e.

$P(A) = \frac{\text{number of ways in which } E \text{ can occur favourable to } A}{\text{total number of ways in which } E \text{ can occur}}$

In examples of statistics we are concerned with choosing at
random one or more objects from a given collection of objects.
Let us precisely define this notion.

To choose one object at random from the $N$ objects, say
$a_1, a_2, \ldots, a_N$, means that each object has the same probability of
being chosen. That is,

Prob (choosing $a_i$) $= \frac{1}{N}$, $i = 1, 2, \ldots, N$.

To choose two objects at random from $N$ objects means that
each pair of objects (disregarding order) has the same probability
of being chosen as any other pair.





To choose $n$ objects at random $(n \le N)$ from the $N$ objects
means that each $n$-tuple, say $a_{i_1}, a_{i_2}, \ldots, a_{i_n}$, is as likely to be
chosen as any other $n$-tuple.

The expression "at random" will be used only with respect to
an equiprobable space.

Ex. 1. Three groups of children contain respectively 3 girls
and 1 boy, 2 girls and 2 boys, 1 girl and 3 boys. One child is selected
at random from each group. Find the chance that the three selected
comprise 1 girl and 2 boys. [Agra 55]

Sol. The event may happen in any of the mutually exclusive
ways :

girl-boy-boy; boy-girl-boy; boy-boy-girl.

The probabilities of these three are :

$\frac{3}{4} \cdot \frac{2}{4} \cdot \frac{3}{4}$; $\frac{1}{4} \cdot \frac{2}{4} \cdot \frac{3}{4}$; $\frac{1}{4} \cdot \frac{2}{4} \cdot \frac{1}{4}$

respectively.

The required probability is their sum

$= \frac{1}{64} [18 + 6 + 2] = \frac{13}{32}$.

Ex. 2. A room has three lamp sockets. From a collection
of 10 light bulbs of which only six are good, three bulbs are
selected at random and placed in the sockets. What is the probability
that there will be light in the room ?

Sol. Number of ways in which three bulbs can be selected
out of 10 light bulbs is ${}^{10}C_{3}$ ...(1)

There will be light in the room, if at least one good bulb is
selected. The number of ways in which this can happen

$= {}^{6}C_{1} \times {}^{4}C_{2} + {}^{6}C_{2} \times {}^{4}C_{1} + {}^{6}C_{3} \times {}^{4}C_{0}$ ...(2)

Therefore the required probability is

$\frac{{}^{6}C_{1} \times {}^{4}C_{2} + {}^{6}C_{2} \times {}^{4}C_{1} + {}^{6}C_{3} \times {}^{4}C_{0}}{{}^{10}C_{3}} = \frac{29}{30}$.
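
The count in (2) is easy to reproduce, and the complementary event (all three bulbs bad) gives an independent check. A brief Python sketch, with illustrative variable names:

```python
from fractions import Fraction
from math import comb

total = comb(10, 3)                       # ways to select 3 bulbs out of 10

# g good bulbs out of 6 and (3 - g) bad bulbs out of 4, for g = 1, 2, 3
favourable = sum(comb(6, g) * comb(4, 3 - g) for g in (1, 2, 3))
p_light = Fraction(favourable, total)

# Complement: no light only when all three selected bulbs are bad.
p_light_by_complement = 1 - Fraction(comb(4, 3), total)

print(p_light, p_light_by_complement)
```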

Ex. 3. Two dice are thrown. What is the probability of scoring
either a double, or a sum greater than 9 ?

Sol. The outcome space $S$ consists of 36 pairs $(i, j)$, where
$i, j$ run independently from 1 to 6. We make the assumption that
the events are equiprobable and so attach a probability of 1/36 to each
elementary event.

Let $A = \{(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)\}$,
$B = \{(4, 6), (5, 5), (6, 4), (5, 6), (6, 5), (6, 6)\}$
and so $A \cap B = \{(5, 5), (6, 6)\}$.




We thus have

$P(A \mid S) = P(B \mid S) = 6/36$, $P(A \cap B \mid S) = 2/36$
and so $P(A + B \mid S) = P(A \mid S) + P(B \mid S) - P(A \cap B \mid S)$

$= \frac{1}{6} + \frac{1}{6} - \frac{1}{18} = \frac{5}{18}$.
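
The events of Ex. 3 can be enumerated over the 36 equally likely pairs. A minimal Python sketch, in which the helper `P` is an illustrative assumption:

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))          # 36 equally likely pairs (i, j)
A = {(i, j) for (i, j) in S if i == j}           # a double
B = {(i, j) for (i, j) in S if i + j > 9}        # a sum greater than 9

def P(event):
    return Fraction(len(event), len(S))

p_either = P(A | B)
print(p_either, P(A) + P(B) - P(A & B))
```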

Ex. 4. An integer is chosen at random from the first two hundred
positive integers. What is the probability that the integer chosen is
divisible by 6 or 8 ? [Agra 67]

[Hint. $A$ = {the integer is divisible by 6}

$B$ = {the integer is divisible by 8}.

Since $6 \times 33 = 198$ and $8 \times 25 = 200$, the set $A$ contains 33 integers
and the set $B$, 25 integers. Hence $P(A) = 33/200$, $P(B) = 25/200$.

Also $AB$ = {the integer is divisible by 6 and 8, i.e. by
their l.c.m. 24}.

The number of integers divisible by 6 and 8 = greatest integer
less than $\frac{200}{24}$ = 8

$\therefore P(AB) = 8/200$ and $P(A + B) = 33/200 + 25/200 - 8/200 = \frac{1}{4}$]
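
The counts used in the hint follow from integer division, and a direct scan of the integers 1 to 200 confirms the answer. A short Python sketch with illustrative variable names:

```python
from fractions import Fraction
from math import gcd

N = 200
lcm_6_8 = 6 * 8 // gcd(6, 8)          # 24

count_6 = N // 6                      # multiples of 6 up to 200
count_8 = N // 8                      # multiples of 8 up to 200
count_both = N // lcm_6_8             # multiples of 24 up to 200

p_formula = Fraction(count_6 + count_8 - count_both, N)
p_direct = Fraction(sum(1 for k in range(1, N + 1) if k % 6 == 0 or k % 8 == 0), N)

print(count_6, count_8, count_both, p_formula)
```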

4.52. Expectation. If $p$ represents a person's chance of
success in any venture and $M$ the sum of money which he will receive
in case of success, the sum of money denoted by $pM$ is called his
expectation.

If a person in a bet wins $a$ with probability $p$ and loses $b$
with probability $q$, then his expectation is denoted by $(pa - qb)$.

Ex. 1. Two players of equal skill $A$ and $B$ are playing a set
of games; they leave off playing when $A$ wants 3 points and $B$ wants
2. If the stake is Rs 16, what share ought each to take ?

Sol. The set will necessarily be decided in 4 games. $A$ may
win 3 games in exactly 3 games or 4 games. Since $A$ and $B$ are of
equal skill, the chance that $A$ wins any game is equal to his chance
of losing it, each being $\frac{1}{2}$.

The chance that $A$ wins in 3 games is $\frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}$.

To win 3 games in 4 games, $A$ must win the last game and any
two of the other 3. The chance for this is

${}^{3}C_{2} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right) \times \frac{1}{2} = \frac{3}{16}$

$\therefore$ $A$'s chance of winning is

$\frac{1}{8} + \frac{3}{16} = \frac{5}{16}$.

If the stake is Rs 16, his share thus ought to be Rs 5 and so
$B$'s share must be Rs 11.
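
The division of the stake can also be computed recursively, by conditioning on the result of the next game. The following Python sketch is one way to do this; the function `win_chance` and its arguments are illustrative assumptions, not the method of the text.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def win_chance(a_needs, b_needs, p=Fraction(1, 2)):
    """Chance that A takes the set when A still needs a_needs points,
    B still needs b_needs points and A wins each game with probability p."""
    if a_needs == 0:
        return Fraction(1)
    if b_needs == 0:
        return Fraction(0)
    # Condition on the next game: A wins it (prob. p) or loses it (prob. 1 - p).
    return (p * win_chance(a_needs - 1, b_needs, p)
            + (1 - p) * win_chance(a_needs, b_needs - 1, p))

stake = 16
a_share = stake * win_chance(3, 2)
b_share = stake - a_share
print(win_chance(3, 2), a_share, b_share)
```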




Ex. 2. $A$ makes a bet with $B$ of 5s. to 2s. that in a single
throw with two dice he will throw seven before $B$ throws four. Each
has a pair of dice and they throw simultaneously until one of them
wins, equal throws being disregarded. Find $B$'s expectation.
[Agra 60, 66]

Sol. The chance of throwing 7 with two dice is $6/6^2$, for 7
can be made up as (1, 6); (2, 5); (3, 4), each with two possibilities,
and the chance of throwing 4, which can be made up as (1, 3);
(2, 2) with two and one possibilities only, is $3/6^2$.

Thus $A$'s chance in each trial is double of $B$'s. Now we require
$B$'s expectation in the long run, the throwing being continued
until one or other of them wins.

Let $x$ = $B$'s chance on this supposition; then clearly $2x$ = $A$'s
chance and therefore

$x + 2x = 1 \Rightarrow x = 1/3$

$B$'s expectation $= \frac{1}{3}$ of 5s. $- \frac{2}{3}$ of 2s. = 4d.

4.6. Conditional Probability. Definition. Assume that $A$
and $B$ are two events in a sample space $S$ such that $P(B) > 0$. Then
the conditional probability of event $A$ based on the hypothesis
that event $B$ has occurred is defined by the following relation,

$P(A \mid B) = \frac{P(AB)}{P(B)}$

Whenever we compute $P(A \mid B)$ we are essentially computing
$P(A)$ with respect to the reduced sample space $B$, rather than with
respect to the original sample space $S$.

Similarly $P(B \mid A) = \frac{P(AB)}{P(A)}$, provided that $P(A) > 0$.

Illustrative Examples. 

Ex. 1. An urn contains 6 red and 4 black balls. Two balls are
drawn without replacement. What is the probability that the second
ball is red if it is known that the first is red ?

Sol. Let $B$ = {first ball is red}; $A$ = {second ball is red}.

There are ${}^{10}C_{2}$ ways of drawing two balls from the urn, and
so the sample space $S$ contains ${}^{10}C_{2}$ points each with probability
$1/({}^{10}C_{2})$. The number of ways of getting two red balls is ${}^{6}C_{2}$ and
so

$P(AB) = {}^{6}C_{2} / {}^{10}C_{2} = \frac{1}{3}$

$P(B)$ = the probability that the first ball drawn is red = 6/10.
Thus $P(A \mid B) = \frac{1/3}{6/10} = 5/9$






Note. $P(A \mid B)$ can be computed directly by considering that
if the first ball is red, this leaves 5 red and 4 black balls in the urn,
and so

$P(A \mid B) = 5/9$.

Ex. 2. Two fair dice are tossed. Find the probability that
the sum of the two dice is 10 when it is known that the first die shows
a larger value than the second die. Also find the probability when
the conditions are inverted.

Sol. Let us denote the outcome as $(x_1, x_2)$, where $x_i$ is the
outcome of the $i$th die, $i = 1, 2$.

Let $A = \{(x_1, x_2) \mid x_1 + x_2 = 10\}$, $B = \{(x_1, x_2) \mid x_1 > x_2\}$

Then we have to calculate $P(A \mid B)$.

The sample space $S$ may be represented by the following array
of 36 equally likely outcomes.

(1, 1) (1, 2) ... (1, 6)
(2, 1) (2, 2) ... (2, 6)
...
(6, 1) (6, 2) ... (6, 6)

Thus $A = \{(5, 5), (4, 6), (6, 4)\}$ (three equally likely cases)

$B = \{(2, 1), (3, 1), (3, 2), \ldots, (6, 5)\}$ (fifteen equally likely cases)

Since (6, 4) is the only outcome out of the 36 equally likely outcomes
such that $6 + 4 = 10$ and $6 > 4$,

$P(AB) = \frac{1}{36}$

Also $P(B) = \frac{15}{36}$

Hence $P(A \mid B) = \frac{1/36}{15/36} = \frac{1}{15}$

Similarly, $P(B \mid A) = \frac{P(AB)}{P(A)} = \frac{1/36}{3/36} = \frac{1}{3}$, since $P(A) = \frac{3}{36}$.

Note. We can directly compute $P(B \mid A) = 1/3$, since the sample
space now consists of $A$ (that is, three outcomes), and only one of
these three outcomes is consistent with the event $B$.
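
Both conditional probabilities can be read off by working in the reduced sample space, as the Note suggests. A minimal Python sketch, in which the helper `cond` is an illustrative assumption:

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))               # 36 equally likely outcomes
A = {(x1, x2) for (x1, x2) in S if x1 + x2 == 10}      # the sum is 10
B = {(x1, x2) for (x1, x2) in S if x1 > x2}            # first die shows more

def cond(E, F):
    """P(E | F), computed by counting inside the reduced sample space F."""
    return Fraction(len(E & F), len(F))

p_A_given_B = cond(A, B)
p_B_given_A = cond(B, A)
print(p_A_given_B, p_B_given_A)
```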

Ex. 3. Three newspapers $A$, $B$, $C$ are published in a certain
city, and a survey shows that of the adult population : 20% read $A$,
16% read $B$, 14% read $C$, 8% read both $A$ and $B$, 5% read both $A$
and $C$, 4% read both $B$ and $C$, and 2% read all three. What
percentage reads at least one of the papers ? Of those that read at least
one, what percentage reads both $A$ and $B$ ? [Delhi 55, 65]

Sol. The data given in the problem can be exhibited as
follows :






$P(A)$ = probability that paper $A$ is read = 20/100

$P(B) = 16/100$, $P(C) = 14/100$, $P(AB) = 8/100$, $P(AC) = 5/100$,

$P(BC) = 4/100$ and $P(ABC) = 2/100$.

We have to determine here

(i) $P(A \cup B \cup C)$, (ii) $P[A \cap B \mid A \cup B \cup C]$

(i) $P(A \cup B \cup C) = \sum P(A) - \sum P(AB) + P(ABC)$

$= \frac{20 + 16 + 14}{100} - \frac{8 + 5 + 4}{100} + \frac{2}{100} = \frac{35}{100}$

Hence 35% of the adult population reads at least one of the
papers.

(ii) $P[(A \cap B) \mid (A \cup B \cup C)] = \frac{P[(A \cap B) \cap (A \cup B \cup C)]}{P(A \cup B \cup C)}$

$= \frac{P(A \cap B)}{P(A \cup B \cup C)} = \frac{8}{35}$

Thus, of those that read at least one, $\frac{8}{35} \times 100 \approx 22.86\%$ read
both $A$ and $B$.
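
The arithmetic of this example is easily reproduced with exact fractions. A brief Python sketch; the variable names are illustrative:

```python
from fractions import Fraction

def pct(x):
    """Convert a percentage into an exact probability."""
    return Fraction(x, 100)

P_A, P_B, P_C = pct(20), pct(16), pct(14)
P_AB, P_AC, P_BC, P_ABC = pct(8), pct(5), pct(4), pct(2)

# (i) Theorem IV for three events
at_least_one = P_A + P_B + P_C - P_AB - P_AC - P_BC + P_ABC

# (ii) A∩B is contained in A∪B∪C, so the conditional probability is a ratio
both_AB_given_reader = P_AB / at_least_one

print(at_least_one, both_AB_given_reader, float(both_AB_given_reader * 100))
```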

Remark. It is a simple matter to verify that $P(A \mid B)$, for
fixed $B$, satisfies the various postulates of probability.

That is, we have

(1) $0 \le P(A \mid B) \le 1$

(2) $P(S \mid B) = 1$

(3) $P(A_1 \cup A_2 \mid B) = P(A_1 \mid B) + P(A_2 \mid B)$ if $A_1 \cap A_2 = \phi$

(4) $P(A_1 \cup A_2 \cup \cdots \mid B) = P(A_1 \mid B) + P(A_2 \mid B) + \cdots$
if $A_i \cap A_j = \phi$ for $i \ne j$.

Thus we have two ways of computing the conditional
probability $P(A \mid B)$ :

(a) Directly, by considering the probability of $A$ with respect
to the reduced sample space $B$.

(b) Using the definition $P(A \mid B) = \frac{P(AB)}{P(B)}$, where $P(AB)$ and
$P(B)$ are computed with respect to the original sample space $S$.


4.61. Independent Events.

Definition. The two events $A$ and $B$ are said to be mutually
independent if

$P(A \mid B) = P(A)$
$P(B \mid A) = P(B)$ ...(1)





Note that for mutually independent events

$P(AB) = P(A) \cdot P(B)$ ...(2)

Remark. For two independent events, the equations
$P(A \mid B) = P(A)$, $P(B \mid A) = P(B)$ hold, but when $P(A) = 0$ or
$P(B) = 0$, then $P(B \mid A)$ or $P(A \mid B)$ is not defined. For this
reason some authors prefer to define mutual independence in such
a way that equation (1) above remains valid for all circumstances,
including

$P(A) = 0$, $P(A) = 1$

$P(B) = 0$, $P(B) = 1$

For this purpose, the following defining equations are
suggested

$P(AB) = P(A) P(B)$, $P(A\bar{B}) = P(A) P(\bar{B})$

$P(\bar{A}B) = P(\bar{A}) P(B)$, $P(\bar{A}\bar{B}) = P(\bar{A}) P(\bar{B})$

Independent events are more specifically called statistically
independent or stochastically independent.

The events $A$, $B$ and $C$, defined on the same probability space,
are said to be statistically independent if

(3) $P(AB) = P(A)P(B)$, $P(AC) = P(A)P(C)$, $P(BC) = P(B)P(C)$

(4) $P(ABC) = P(A) P(B) P(C)$

If (3) and (4) hold, then

(5) $P(A \mid B, C) = P(A \mid B) = P(A \mid C) = P(A)$

$P(B \mid A, C) = P(B \mid A) = P(B \mid C) = P(B)$

$P(C \mid A, B) = P(C \mid A) = P(C \mid B) = P(C)$

Conversely, if all the relations in (5) hold, then all the
relations in (3) and (4) hold.

Three events may be pairwise independent but not independent
themselves; for example, if we toss a pair of fair coins then the
sample space $S = \{HH, HT, TH, TT\}$.

Let $A$ = {head on the first coin} = $\{HH, HT\}$
$B$ = {head on the second coin} = $\{HH, TH\}$

$C$ = {head on one coin only} = $\{HT, TH\}$

We observe

$P(A) = P(B) = P(C) = \frac{2}{4} = \frac{1}{2}$ and
$P(AB) = P(\{HH\}) = \frac{1}{4} = P(A) P(B)$

$P(AC) = P(\{HT\}) = \frac{1}{4} = P(A) P(C)$

$P(BC) = P(\{TH\}) = \frac{1}{4} = P(B) P(C)$

but $P(ABC) = P(\phi) = 0 \ne P(A) P(B) P(C)$.




Thus, the events are not independent, though they are
independent pairwise.
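
The two-coin example can be verified mechanically by testing conditions (3) and (4) on the four-point sample space. A small Python sketch, with illustrative names:

```python
from fractions import Fraction
from itertools import combinations, product

S = set(product("HT", repeat=2))             # {HH, HT, TH, TT} as pairs
A = {s for s in S if s[0] == "H"}            # head on the first coin
B = {s for s in S if s[1] == "H"}            # head on the second coin
C = {s for s in S if s.count("H") == 1}      # head on one coin only

def P(event):
    return Fraction(len(event), len(S))

# Condition (3): every pair satisfies the product rule.
pairwise = all(P(X & Y) == P(X) * P(Y) for X, Y in combinations((A, B, C), 2))
# Condition (4): the triple product rule.
triple = P(A & B & C) == P(A) * P(B) * P(C)

print(pairwise, triple)
```

Condition (3) holds while condition (4) fails, in agreement with the conclusion above.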

Illustrative Examples 

Ex. 1. An urn contains four tickets with numbers 112, 121,
211, 222 and one ticket is drawn. Let $A_i$ $(i = 1, 2, 3)$ be the event
that the $i$th digit of the number of the ticket drawn is 1. Discuss the
independence of the events $A_1$, $A_2$ and $A_3$.

Sol. Here $P(A_1) = \frac{2}{4} = \frac{1}{2}$; $P(A_2) = \frac{2}{4} = \frac{1}{2}$; $P(A_3) = \frac{2}{4} = \frac{1}{2}$

$P(A_1 A_2) = \frac{1}{4}$, $P(A_1 A_3) = \frac{1}{4}$, $P(A_2 A_3) = \frac{1}{4}$

Since $P(A_1 A_2) = \frac{1}{4} = P(A_1) P(A_2)$

$P(A_2 A_3) = \frac{1}{4} = P(A_2) P(A_3)$
and $P(A_1 A_3) = \frac{1}{4} = P(A_1) P(A_3)$

the events $A_1$, $A_2$ and $A_3$ are pairwise independent.

Also $P(A_1 A_2 A_3)$ = Prob {all the three digits in the number
are 1's}

$= 0$

$\ne P(A_1) P(A_2) P(A_3)$

This implies that though $A_1$, $A_2$ and $A_3$ are pairwise
independent, they are not mutually independent.

Ex. 2. Prove that if $A$, $B$ and $C$ are random events in a
sample space and if $A$, $B$, $C$ are pairwise independent and $A$ is
independent of $(B \cup C)$, then $A$, $B$, $C$ are independent. [Delhi 65]

Sol. We are given

$P(AB) = P(A) P(B)$

$P(BC) = P(B) P(C)$

$P(AC) = P(A) P(C)$

$P[A(B \cup C)] = P(A) P(B \cup C)$

Now $P[A(B \cup C)] = P(AB \cup AC)$

$= P(AB) + P(AC) - P(ABC)$

$= P(A) P(B) + P(A) P(C) - P(ABC)$ ...(1)

and $P(A) P(B \cup C) = P(A) [P(B) + P(C) - P(BC)]$

$= P(A)P(B) + P(A)P(C) - P(A)P(BC)$ ...(2)

(1) and (2) $\Rightarrow P(ABC) = P(A)P(BC) = P(A)P(B)P(C)$.

Hence $A$, $B$, $C$ are mutually independent.

Ex. 3. An event $E_1$ is known to be stochastically independent
of the events $E_2$, $E_2 + E_3$ and $E_2 E_3$. Show that it is also
stochastically independent of $E_3$. [Delhi 64]

Ex. 4. If $A$ and $B$ are two independent events, show that
$\bar{A}$ and $\bar{B}$ are also independent.





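Sol. Since $\overline{(A \cup B)} = \bar{A} \cap \bar{B}$ (by De Morgan's law)

$\Rightarrow P(A \cup B) + P(\bar{A} \cap \bar{B}) = P(S) = 1$

$\Rightarrow P(\bar{A}\bar{B}) = 1 - P(A \cup B)$

$= 1 - P(A) - P(B) + P(AB)$

$= 1 - P(A) - P(B) + P(A)P(B)$,

since, $A$ and $B$ being independent,

$P(AB) = P(A)P(B)$.

$\therefore P(\bar{A}\bar{B}) = \{1 - P(A)\} \{1 - P(B)\}$

$= P(\bar{A}) \cdot P(\bar{B})$.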

Ex. 5. (a) An event $A$ is known to be stochastically independent
of the events $B$, $B \cup C$, and $B \cap C$. Show that it is also stochastically
independent of $C$. [B.Sc. Delhi 71]

(b) $A$ and $B$ are independent events both of which are neither
certain nor impossible. Prove that the opposite events $\bar{A}$ and $\bar{B}$ are
independent. [B.Sc. Delhi 61]

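4.62. Complementation rule. For any two independent events
$A$ and $B$,

$P(A \cup B) = 1 - P(\bar{A})P(\bar{B})$

and generally for $n$ arbitrary independent events $A_1, A_2, \ldots, A_n$

$P(A_1 \cup A_2 \cup \cdots \cup A_n) = 1 - P(\bar{A}_1)P(\bar{A}_2) \cdots P(\bar{A}_n)$

Proof. Since $(A \cup B) \cup \overline{(A \cup B)} = S$

$(A \cup B) \cup (\bar{A} \cap \bar{B}) = S$ [$\overline{A \cup B} = \bar{A} \cap \bar{B}$ by De Morgan's law]

$P(A \cup B) + P(\bar{A} \cap \bar{B}) = P(S) = 1$

$P(A \cup B) = 1 - P(\bar{A}\bar{B}) = 1 - P(\bar{A})P(\bar{B})$
[independence between $A$ and $B$ implies independence between
$\bar{A}$ and $\bar{B}$]

If there are $n$ independent events $A_1, A_2, \ldots, A_n$,

$(\cup_i A_i) \cup \overline{(\cup_i A_i)} = S$, but $\overline{\cup_i A_i} = \cap_i \bar{A}_i$;

hence $P(\cup_i A_i) + P(\cap_i \bar{A}_i) = P(S) = 1$

or $P(A_1 \cup A_2 \cup \cdots \cup A_n) = 1 - P(\bar{A}_1)P(\bar{A}_2) \cdots P(\bar{A}_n)$.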





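Remark. For any three events $A$, $B$, $C$

$P(A + B + C) = 1 - P(\bar{A})P(\bar{B} \mid \bar{A})P(\bar{C} \mid \bar{A}\bar{B})$

and $P(\bar{A} + \bar{B} + \bar{C}) = 1 - P(A)P(B \mid A)P(C \mid AB)$

where $P(\bar{A})$ denotes the probability that $A$ does not occur.

For $P(A \cup B \cup C) + P(\overline{A \cup B \cup C}) = P(S) = 1$

and therefore $P(A \cup B \cup C) = 1 - P(\bar{A} \cap \bar{B} \cap \bar{C})$
[using De Morgan's law]

$= 1 - P(\bar{A})P(\bar{B} \mid \bar{A})P(\bar{C} \mid \bar{A}\bar{B})$

Similarly, we can have the other result.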


Ex. 1. The probabilities of $n$ independent events are $p_1, p_2, \ldots, p_n$.
Find an expression for the probability that at least one of the
events will happen. [Delhi 66, I.A.S.]

[Hint. The chance that all the events fail is

$(1 - p_1)(1 - p_2)(1 - p_3) \cdots (1 - p_n)$

and except in this case some one of the events must happen;
hence the required chance is

$1 - (1 - p_1)(1 - p_2) \cdots (1 - p_n)$]

Ex. 2. $p$ is the probability that a man aged $x$ will die in a year.
Find the probability that out of $n$ men $A_1, A_2, \ldots, A_n$, each aged $x$, $A_1$
will die in the year and be the first to die. [Punjab 58, I.A.S. 54]

Sol.

The chance that a given man dies in the year $= p$
The chance that a given man does not die in the year $= 1 - p$
The chance that none of the $n$ dies in the year $= (1 - p)^n$
The chance that at least one man dies in the year $= 1 - (1 - p)^n$

Since the chance that $A_1$ is the first to die is obviously $1/n$
and since this is independent of the chance that at least one man
dies in the year, the required chance $= \frac{1}{n} [1 - (1 - p)^n]$.

Ex. 3. $A$ can solve 75% of the problems in this book and $B$
can solve 90%. What is the probability that either $A$ or $B$ can
solve a problem chosen at random ?

[Punjab ( 6} 

[Hint. $P(A \cup B) = 1 - P(\bar{A})P(\bar{B})$

$= 1 - \frac{1}{4} \cdot \frac{1}{10} = 39/40$]




4.7. Theorem of Multiplication. The multiplication rule for
the case of two events $A$ and $B$ can be obtained through the
definition of the conditional probability.

$P(AB) = P(A)P(B \mid A)$

$P(AB) = P(B)P(A \mid B)$ ...(1)

This rule can be extended to the case of more than two events.
For instance, for three events $A$, $B$, $C$ we can write

$P(ABC) = P(AB)P(C \mid AB)$

$= P(A)P(B \mid A)P(C \mid AB)$ ...(2)

In general, we may show by induction that

$P(A_1 A_2 \cdots A_k) = P(A_1)P(A_2 \mid A_1)P(A_3 \mid A_1 A_2)P(A_4 \mid A_1 A_2 A_3) \cdots P(A_k \mid A_1 A_2 \cdots A_{k-1})$ ...(3)

and there are $k!$ such relations, which may be obtained by
permuting the letters on the right-hand side.

When a finite number or a countably infinite number of
events $A_1, A_2, \ldots, A_k$ are mutually independent, we have

$P(A_1 A_2 \cdots A_k) = P(A_1)P(A_2) \cdots P(A_k)$ ...(4)

Note. The multiplicative law of probabilities is particularly
useful in simplifying the computation of probabilities for
compound events. A compound event is one that consists of two or
more single events, as when a die is tossed twice or three cards
are drawn one at a time from a deck.


Ex. 1. Show that

$P(AB) \le P(A) \le P(A + B) \le P(A) + P(B)$ [Delhi 64, 67]

Sol. By the compound probability theorem

$P(AB) = P(A)P(B \mid A)$

Now since $P(B \mid A) \le 1$, $P(AB) \le P(A)$ ...(1)

Again $P(A + B) = P(A\bar{B}) + P(\bar{A}B) + P(AB)$

and $P(A) = P(A\bar{B}) + P(AB)$

$\therefore P(A + B) = P(A) + P(\bar{A}B)$

This $\Rightarrow P(A) \le P(A + B)$, since $P(\bar{A}B) \ge 0$ ...(2)

Also $P(A + B) = P(A) + P(B) - P(AB)$

This $\Rightarrow P(A + B) \le P(A) + P(B)$, since $P(AB) \ge 0$ ...(3)

The required result is obtained from (1), (2) & (3).

Ex. 2. Comment on the statement: "The probability that A passes an examination is 1/2. The probability that B passes an examination is also 1/2. Therefore it is certain that either A or B will pass."

[Hint. $P(A+B) = P(A) + P(B) - P(AB)$,

or, the two events being independent, $P(A+B) = 1 - P(\bar A)\,P(\bar B) = 1 - \tfrac{1}{4} = \tfrac{3}{4}$.

The events A and B are not mutually exclusive, but they are independent. The suggestion that $P(A+B) = 1$ is certainly wrong.]

Ex. 3. Define independent and mutually exclusive events. Can two events be mutually exclusive and independent simultaneously? Can mutually exclusive events be regarded as a special case of dependent or independent events? [B.Sc. Agra 63, Delhi Hons. 57]

[Hint. If $A \cap B = \phi$, i.e. if A and B have no sample points in common, the two events A and B are mutually exclusive. A and B are mutually exclusive if and only if they cannot occur simultaneously.

An event B is said to be independent of an event A if the probability that B occurs is not influenced by whether A has or has not occurred.

Two events with positive probabilities cannot be mutually exclusive and independent simultaneously. For mutual exclusiveness implies $P(AB) = 0$, while independence implies $P(AB) = P(A)\,P(B) \ne 0$, since $P(A) > 0$, $P(B) > 0$. Again, $P(A)\,P(B) > 0 \Rightarrow P(AB) = P(A)\,P(B) > 0 \Rightarrow$ A and B are not mutually exclusive.

If A and B are mutually exclusive, then the occurrence of A implies the non-occurrence of B and the occurrence of B implies the non-occurrence of A. Thus the occurrence of A or B definitely conditions the occurrence of B or A, so that mutually exclusive events are a special case of dependent events.]

Ex. 4. It is 8 : 5 against a husband who is 55 years old living till he is 75 and 4 : 3 against his wife, who is now 48, living till she is 68. Find the probability that (a) the couple will be alive 20 years hence, (b) one at least of them will be alive 20 years hence. [Agra 61, Punjab 58]

[Hint. Let A and B be the events that the husband and the wife, respectively, are alive 20 years hence. Then $P(A) = \frac{5}{13}$, $P(B) = \frac{3}{7}$, and

$P(AB) = P(A)\,P(B) = \frac{15}{91}$,

$P(A+B) = P(A) + P(B) - P(AB) = \frac{5}{13} + \frac{3}{7} - \frac{15}{91} = \frac{59}{91}$.]



4.8. Marginal Probability. Suppose the sample space S, which consists of n points each with probability $1/n$, is partitioned into r disjoint (mutually exclusive) subsets $A_1, A_2, \ldots, A_r$. Let $B_1, B_2, \ldots, B_s$ be another partition of S into s disjoint subsets. If $n_{ij}$ of the n outcomes have the attributes $A_i$ and $B_j$, the n points in S may then be classified in a two-way table as below.

              B_1     B_2    ...    B_s
    A_1      n_11    n_12    ...    n_1s
    A_2      n_21    n_22    ...    n_2s
    ...
    A_r      n_r1    n_r2    ...    n_rs

Obviously $P(A_i B_j) = \dfrac{n_{ij}}{n}$.

If we are interested in only one of the criteria of classification, say A, and are indifferent to the B classification, then

$$P(A_i) = \sum_{j=1}^{s} \frac{n_{ij}}{n} \quad\text{or}\quad P(A_i) = \sum_{j=1}^{s} P(A_i B_j).$$

This is called a marginal probability, and the term marginal is used whenever one or more criteria of classification are ignored. The marginal probability of $B_j$ is

$$P(B_j) = \sum_{i=1}^{r} P(A_i B_j).$$

In point set terminology, we have partitioned the sample space S into $rs$ disjoint subsets, where the general subset is denoted by $(A_i \cap B_j)$.

Now $A_i = (A_i \cap B_1) \cup (A_i \cap B_2) \cup \ldots \cup (A_i \cap B_s)$,

where $(A_i \cap B_j) \cap (A_i \cap B_k) = \phi$ whenever $j \ne k$.

Hence we may apply the addition property for mutually exclusive events and write

$$P(A_i) = P(A_i \cap B_1) + P(A_i \cap B_2) + \ldots + P(A_i \cap B_s)$$

or

$$P(A_i) = \sum_{j=1}^{s} P(A_i, B_j).$$
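The passage from a two-way table of counts to marginal probabilities can be sketched in Python as follows. The counts are invented purely for illustration (they do not come from the text), and the variable names are our own.

```python
from fractions import Fraction

# Hypothetical counts n_ij: rows are A_1, A_2, A_3; columns are B_1, B_2.
counts = [[2, 4],
          [3, 1],
          [5, 5]]
n = sum(sum(row) for row in counts)          # total number of sample points

# Joint probabilities P(A_i B_j) = n_ij / n.
joint = [[Fraction(c, n) for c in row] for row in counts]

# Marginal of A: sum over j (row sums); marginal of B: sum over i (column sums).
marginal_A = [sum(row) for row in joint]
marginal_B = [sum(col) for col in zip(*joint)]

print(marginal_A)
print(marginal_B)
```

Each marginal list sums to 1, as it must, since the $A_i$ (and likewise the $B_j$) form a partition of S.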








Partitioning of the sample space S.

Definition. The events $B_1, B_2, \ldots, B_k$ represent a partition of the sample space S if

(a) $B_i \cap B_j = \phi$ for all $i \ne j$,

(b) $\bigcup_{i=1}^{k} B_i = S$,

(c) $P(B_i) > 0$ for all i.

In words the above relations can be expressed as: 'with the performance of an experiment E one and only one of the events $B_i$ occurs'. For example, when a die is tossed, the events $B_1 = \{1, 2\}$, $B_2 = \{3, 4, 5\}$ and $B_3 = \{6\}$ would represent a partition of the sample space, while $C_1 = \{1, 2, 3, 4\}$ and $C_2 = \{4, 5, 6\}$ would not.

Let A be some event with respect to S and let $B_1, B_2, \ldots, B_k$ be a partition of S. The adjacent Venn diagram illustrates this for $k = 8$.

Hence we may write

$$A = (A \cap B_1) \cup (A \cap B_2) \cup \ldots \cup (A \cap B_k),$$

where some of the sets $A \cap B_j$ may be empty.

Since all the events $A \cap B_1, \ldots, A \cap B_k$ are pairwise mutually exclusive, we obtain

$$P(A) = P(A \cap B_1) + P(A \cap B_2) + \ldots + P(A \cap B_k)$$

or

$$P(A) = P(A \mid B_1)\,P(B_1) + P(A \mid B_2)\,P(B_2) + \ldots + P(A \mid B_k)\,P(B_k).$$

This result is known as the theorem on Total Probability.

This result represents an extremely useful relationship. For often when $P(A)$ is required, it may be difficult to compute it directly. However, with the additional information that $B_j$ has occurred, we may be able to evaluate $P(A \mid B_j)$ and then use the above formula.

Example. Three factories 1, 2, 3 produce a certain item. It is known that 1 turns out twice as many items as 2, and that 2 and 3 turn out the same number of items (during a specified production period). It is also known that 2 percent of the items produced by 1 and by 2 are defective, while 4 percent of those manufactured by 3 are defective. All the items produced are put into one stockpile, and then one item is chosen at random. What is the probability that this item is defective?

Solution: Let

A = {the item is defective}, $B_1$ = {the item came from 1},
$B_2$ = {the item came from 2}, $B_3$ = {the item came from 3}.

We require

$$P(A) = P(A \mid B_1)\,P(B_1) + P(A \mid B_2)\,P(B_2) + P(A \mid B_3)\,P(B_3) \qquad ...(1)$$

Suppose 2 and 3 each turn out x items; then 1 turns out 2x items and the total number of items turned out is

$$2x + x + x = 4x.$$

Hence $P(B_1) = \dfrac{2x}{4x} = \dfrac{1}{2}$, $P(B_2) = P(B_3) = \dfrac{x}{4x} = \dfrac{1}{4}$.

Also $P(A \mid B_1) = P(A \mid B_2) = \dfrac{2}{100} = 0.02$,

while $P(A \mid B_3) = 4/100 = 0.04$.

Inserting these values in (1), we obtain

$$P(A) = (0.02)\tfrac{1}{2} + (0.02)\tfrac{1}{4} + (0.04)\tfrac{1}{4} = 0.01 + 0.005 + 0.01 = 0.025.$$
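The computation in the factory example can be checked with a short Python sketch of the theorem on total probability. The function name `total_probability` is our own; exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(A) = sum over j of P(A | B_j) P(B_j), for a partition B_1, ..., B_k."""
    if sum(priors) != 1:
        raise ValueError("the P(B_j) must sum to 1 for a partition")
    return sum(p * l for p, l in zip(priors, likelihoods))

# Factory example: P(B_1) = 1/2, P(B_2) = P(B_3) = 1/4,
# with defective rates 2%, 2% and 4%.
priors = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
defect_rates = [Fraction(2, 100), Fraction(2, 100), Fraction(4, 100)]

p_defective = total_probability(priors, defect_rates)
print(p_defective, float(p_defective))   # 1/40 0.025
```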

4.9. Bayes' Theorem. We may use the above example to motivate another important result. Suppose that one item is chosen from the stockpile and is found to be defective. We wish to know the probability that it was produced in factory 1, that is, we require $P(B_1 \mid A)$.

The concept and properties of conditional probability lead to a useful result, Bayes' formula, which is used to modify probabilities in the light of additional relevant evidence.

We have, for any two events A, B,

$$P(A \mid B)\,P(B) = P(AB) = P(BA) = P(B \mid A)\,P(A)$$

and hence, provided $P(A) \ne 0$,

$$P(B \mid A) = \frac{P(B)\,P(A \mid B)}{P(A)} \quad\text{for all } B. \qquad ...(1)$$

Suppose now that A is an event which can only occur in conjunction with one of k mutually exclusive events $B_1, B_2, \ldots, B_k$. In these circumstances

$$P(A) = P(B_1 A \cup B_2 A \cup \ldots \cup B_k A)$$
$$= P(B_1 A) + P(B_2 A) + \ldots + P(B_k A) \quad\text{(since the } B_i \text{ are mutually exclusive)}$$
$$= P(B_1)\,P(A \mid B_1) + P(B_2)\,P(A \mid B_2) + \ldots + P(B_k)\,P(A \mid B_k). \qquad ...(2)$$

Finally, replacing B by $B_i$ in (1) and using (2), it follows that

$$P(B_i \mid A) = \frac{P(B_i)\,P(A \mid B_i)}{P(A)} = \frac{P(B_i)\,P(A \mid B_i)}{\sum_{j=1}^{k} P(B_j)\,P(A \mid B_j)}. \qquad ...(3)$$


This result is known as Bayes' formula or Bayes' theorem and provides an expression for the probability of $B_i$ conditional on A in terms of the sets of probabilities $P(B_i)$ and $P(A \mid B_i)$. Hence the above formula gives us the probability of a particular $B_i$ (that is, a "cause"), given that the event A has occurred. In order to apply this theorem we must know the values of the $P(B_i)$'s. Quite often these values are not known, and this limits the applicability of this result. There has been considerable controversy about Bayes' theorem. It is perfectly correct mathematically; only an improper choice for $P(B_i)$ makes the result questionable.

Returning to the question posed above, and now applying Eq. (3), we obtain

$$P(B_1 \mid A) = \frac{(\tfrac{1}{2})(0.02)}{(\tfrac{1}{2})(0.02) + (\tfrac{1}{4})(0.02) + (\tfrac{1}{4})(0.04)} = 0.40.$$

Remark. The use of Bayes' theorem is in the following type of situation.

Suppose an event A can be explained by a set of exhaustive and mutually exclusive hypotheses $B_1, B_2, \ldots, B_k$. If we are given the 'a priori' probabilities $P(B_1), P(B_2), \ldots, P(B_k)$, corresponding to a total absence of knowledge regarding the occurrence of A, and the conditional probabilities $P(A \mid B_1), P(A \mid B_2), \ldots, P(A \mid B_k)$, then we are required to form the 'a posteriori' probabilities $P(B_1 \mid A), P(B_2 \mid A), \ldots, P(B_k \mid A)$.
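The step from a priori to a posteriori probabilities can be sketched in Python as below. This is only an illustrative helper of our own (the name `posterior` is hypothetical); it is applied to the three-factory data, with exact fractions.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return the list P(B_i | A) from P(B_i) and P(A | B_i) by Bayes' formula."""
    evidence = sum(p * l for p, l in zip(priors, likelihoods))   # P(A)
    if evidence == 0:
        raise ValueError("P(A) is zero; the posterior is undefined")
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

# Three factories: priors 1/2, 1/4, 1/4; defective rates 2%, 2%, 4%.
priors = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
defect_rates = [Fraction(2, 100), Fraction(2, 100), Fraction(4, 100)]

post = posterior(priors, defect_rates)
print(post)   # the first entry is P(B_1 | A) = 2/5 = 0.40
```

The posterior probabilities again sum to 1, since exactly one of the hypotheses $B_i$ must hold.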


Illustrative Examples 

Ex. 1. The contents of three urns 1, 2, 3 are as follows:

2 red, 1 black balls; 3 red, 2 black balls; 1 red, 1 black balls.

One of the three urns is chosen at random and a ball is drawn from it. The colour of the ball is found to be black. What is the probability that it has been chosen from the third urn?

Sol. Let A be the event that a black ball has been drawn, and $B_i$ the event that the i-th urn has been chosen, i = 1, 2, 3.

Then $P(B_1) = P(B_2) = P(B_3) = 1/3$,

$P(A \mid B_1) = \tfrac{1}{3}$, $P(A \mid B_2) = \tfrac{2}{5}$, $P(A \mid B_3) = \tfrac{1}{2}$.

Also, $P(B_3 \mid A)$ = P{choosing urn 3 | black ball drawn}

$$= \frac{P(B_3)\,P(A \mid B_3)}{\sum_{i=1}^{3} P(B_i)\,P(A \mid B_i)} = \frac{\tfrac{1}{3}\cdot\tfrac{1}{2}}{\tfrac{1}{3}\left(\tfrac{1}{3} + \tfrac{2}{5} + \tfrac{1}{2}\right)} = \frac{15}{37}.$$

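Ex. 2. Three urns are given:

urn 1 contains two white, three black, and four red balls,
urn 2 contains three white, two black, and two red balls,
urn 3 contains four white, one black, and one red ball.

One urn is chosen at random, and two balls are drawn from that urn. If the two balls happen to be white and red, what is the probability that they were drawn from urn 3?

Sol. Let $B_i$ = event of choosing urn i, i = 1, 2, 3,

A = event of choosing a red and a white ball.

We want $P(B_3 \mid A)$.

Using Bayes' rule,

$$P(B_3 \mid A) = \frac{P(B_3)\,P(A \mid B_3)}{\sum_{i=1}^{3} P(B_i)\,P(A \mid B_i)}.$$

But $P(B_1) = P(B_2) = P(B_3) = \tfrac{1}{3}$,

$$P(A \mid B_1) = \frac{{}^{2}C_1 \times {}^{4}C_1}{{}^{9}C_2} = \frac{2}{9},\quad P(A \mid B_2) = \frac{{}^{3}C_1 \times {}^{2}C_1}{{}^{7}C_2} = \frac{2}{7},\quad P(A \mid B_3) = \frac{{}^{4}C_1 \times {}^{1}C_1}{{}^{6}C_2} = \frac{4}{15}.$$

Therefore,

$$P(B_3 \mid A) = \frac{\tfrac{1}{3}\cdot\tfrac{4}{15}}{\tfrac{1}{3}\left(\tfrac{2}{9} + \tfrac{2}{7} + \tfrac{4}{15}\right)} = \frac{21}{61}.$$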
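The conditional probabilities in this example come from counting pairs of balls, which is easy to get wrong by hand. A small Python sketch of the same computation follows; the tuple layout `(white, black, red)` and the function name are our own conventions.

```python
from fractions import Fraction
from math import comb

# Contents of the three urns as (white, black, red).
urns = [(2, 3, 4), (3, 2, 2), (4, 1, 1)]

def p_one_white_one_red(white, black, red):
    """Chance that two balls drawn together are one white and one red."""
    total = white + black + red
    return Fraction(white * red, comb(total, 2))

likelihoods = [p_one_white_one_red(*urn) for urn in urns]
prior = Fraction(1, 3)                       # each urn equally likely
evidence = sum(prior * l for l in likelihoods)
p_urn3 = prior * likelihoods[2] / evidence

print(likelihoods, p_urn3)
```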


Ex. 3. Three machines produce the same type of electrical component, 20% of the total output coming from machine A, 50% from machine B and 30% from machine C. Tests conducted in the past show that 5% of the components from A and 1% from each of B and C prove faulty. A component selected at random from the total output is proved to be faulty. What is the probability that it came from machine A?

Sol. Let us call

$B_1$ the event 'component comes from machine A',
$B_2$ the event 'component comes from machine B',
$B_3$ the event 'component comes from machine C',
E the event 'component found faulty'.

We have to determine $P(B_1 \mid E)$.

Now, from the data, we may assign the following probabilities:

$P(E \mid B_1) = \frac{5}{100}$, $P(E \mid B_2) = \frac{1}{100}$, $P(E \mid B_3) = \frac{1}{100}$;

$P(B_1) = \frac{20}{100}$, $P(B_2) = \frac{50}{100}$, $P(B_3) = \frac{30}{100}$.

Therefore,

$$P(B_1 \mid E) = \frac{P(B_1)\,P(E \mid B_1)}{\sum_{i=1}^{3} P(B_i)\,P(E \mid B_i)} = \frac{\frac{20}{100}\cdot\frac{5}{100}}{\frac{20}{100}\cdot\frac{5}{100} + \frac{50}{100}\cdot\frac{1}{100} + \frac{30}{100}\cdot\frac{1}{100}} = \frac{5}{9}.$$

Ex. 4. In a bolt factory, machines A, B, C manufacture respectively 25, 35 and 40 percent of the total. Of their output 5, 4 and 2 percent respectively are defective bolts. A bolt is drawn at random from the product and is found defective. What are the probabilities that it was manufactured by machine A or B or C? [Bombay 64, Delhi 59]

Sol. Call

E the event 'the bolt is defective',
$E_1$ the event 'the bolt is manufactured by machine A',
$E_2$ the event 'the bolt is manufactured by machine B',
$E_3$ the event 'the bolt is manufactured by machine C'.

We have to determine $P(E_1 \mid E)$, $P(E_2 \mid E)$, $P(E_3 \mid E)$.

Now, from the data, we may assign the following probabilities:

$P(E \mid E_1) = 5/100 = 0.05$, $P(E \mid E_2) = 0.04$, $P(E \mid E_3) = 0.02$;
$P(E_1) = 0.25$, $P(E_2) = 0.35$, $P(E_3) = 0.40$.

By Bayes' theorem

$$P(E_i \mid E) = \frac{P(E_i)\,P(E \mid E_i)}{\sum_{j=1}^{3} P(E_j)\,P(E \mid E_j)}.$$

Thus

$$P(E_1 \mid E) = \frac{(0.25)(0.05)}{(0.25)(0.05) + (0.35)(0.04) + (0.40)(0.02)} = \frac{125}{345}.$$

Similarly $P(E_2 \mid E) = \dfrac{140}{345}$

and $P(E_3 \mid E) = \dfrac{80}{345}$.

Ex. 5. A bag contains 10 balls, either black or white, but it is not known how many of each. A ball is drawn at random and is white. What is the probability that the bag contained at least 5 white balls originally?

Sol. The number of white balls in the bag can be either 0 or 1 or 2 or 3 or ... or 10. Let $B_0, B_1, B_2, \ldots, B_{10}$ denote the eleven hypotheses, with probabilities $P(B_0), P(B_1), \ldots, P(B_{10})$. Let A be the event of drawing a white ball.

If the bag contained n white balls,

$$P(A \mid B_n) = \frac{n}{10}.$$

Therefore the probability of n white balls initially is

$$P(B_n \mid A) = \frac{P(B_n)\,P(A \mid B_n)}{\sum_{m=0}^{10} P(B_m)\,P(A \mid B_m)} = \frac{n\,P(B_n)}{\sum_{m=0}^{10} m\,P(B_m)},$$

and the probability required is

$$P = \frac{\sum_{n=5}^{10} n\,P(B_n)}{\sum_{n=0}^{10} n\,P(B_n)}.$$

For evaluating this, it is necessary to know the a priori probabilities $P(B_0), P(B_1), \ldots, P(B_{10})$.

For the determination of $P(B_0), P(B_1), \ldots, P(B_{10})$, two reasonable courses are open to us.

First. We assume that all values of n are equally likely before the ball is drawn, that is, $P(B_n) = 1/11$ for all n.

Second. We assume that the bag was filled originally by picking 10 balls at random from a mixture of a very large number of black and white balls in equal proportions. The probability of n white and 10 − n black is then proportional to ${}^{10}C_n$.

Hence the probability required is

First Case:

$$P = \frac{\sum_{n=5}^{10} n}{\sum_{n=0}^{10} n} = \frac{45}{55} = \frac{9}{11} = 0.82.$$

Second Case:

$$P = \frac{\sum_{n=5}^{10} n\,{}^{10}C_n}{\sum_{n=0}^{10} n\,{}^{10}C_n} = \frac{3820}{5120} = 0.75 \text{ approximately}.$$

Note that on the second assumption the a priori probability of at least 5 white balls is

$$\frac{1}{2^{10}} \sum_{n=5}^{10} {}^{10}C_n = 0.62,$$

since each ball among the ten can be either white or black, giving us $2^{10}$ possibilities in all.
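The two a priori assumptions can be compared numerically. The Python sketch below is our own check, with hypothetical names; it uses the fact that the common factor $1/10$ in $P(A \mid B_n) = n/10$ cancels, so only weights proportional to $P(B_n)$ are needed.

```python
from fractions import Fraction
from math import comb

def p_at_least_five_white(prior_weights):
    """Posterior chance of 5 or more white balls, given one white ball drawn.

    prior_weights[n] is any quantity proportional to P(B_n), n = 0, ..., 10.
    """
    numerator = sum(n * prior_weights[n] for n in range(5, 11))
    denominator = sum(n * prior_weights[n] for n in range(0, 11))
    return Fraction(numerator, denominator)

uniform_prior = [1] * 11                              # first assumption
binomial_prior = [comb(10, n) for n in range(11)]     # second assumption

first_case = p_at_least_five_white(uniform_prior)
second_case = p_at_least_five_white(binomial_prior)

# A priori chance of at least 5 white balls under the second assumption.
prior_second = Fraction(sum(binomial_prior[5:]), 2 ** 10)

print(first_case, second_case, float(second_case), float(prior_second))
```

Under the second assumption, observing one white ball raises the chance of at least 5 white balls from about 0.62 to about 0.75.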

Ex. 6. It is known that an urn containing altogether 10 balls was filled in the following manner: a coin was tossed 10 times, and according as it showed heads or tails, one white or one black ball was put into the urn. Balls are drawn from this urn one at a time, 10 times in succession (with replacement), and every one turns out to be white. Find the chance that the urn contains nothing but white balls. [Lucknow 66]

Sol. Let $B_k$ denote the event that the urn contains k white balls, k = 0, 1, ..., 10. In 10 throws, if the coin shows k heads and 10 − k tails, then

$$P(B_k) = {}^{10}C_k \left(\tfrac{1}{2}\right)^{k} \left(\tfrac{1}{2}\right)^{10-k} = \frac{{}^{10}C_k}{2^{10}}.$$

Let A be the event of drawing a white ball. Then $P(A \mid B_k) = k/10$, k = 0, 1, 2, ..., 10.

We have to determine $P(B_{10} \mid A)$.

$$P(B_{10} \mid A) = \frac{P(B_{10})\,P(A \mid B_{10})}{\sum_{k=0}^{10} P(B_k)\,P(A \mid B_k)} = \frac{10}{\sum_{k=0}^{10} k\,{}^{10}C_k} = \frac{10}{10 \cdot 2^{10-1}} = \frac{1}{2^{9}}.$$
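The solution above conditions on a single white draw, with $P(A \mid B_k) = k/10$. The following Python sketch (our own, with a hypothetical function name) reproduces that value and, as an assumption about an alternative reading of the problem, also evaluates the posterior when A is taken to be "all ten draws are white", for which $P(A \mid B_k) = (k/10)^{10}$. The two readings give different values.

```python
from fractions import Fraction
from math import comb

def posterior_all_white_urn(draws):
    """P(B_10 | the first `draws` draws, made with replacement, are all white)."""
    prior = [Fraction(comb(10, k), 2 ** 10) for k in range(11)]
    likelihood = [Fraction(k, 10) ** draws for k in range(11)]
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    return prior[10] * likelihood[10] / evidence

single_draw = posterior_all_white_urn(1)    # the reading used in the solution
ten_draws = posterior_all_white_urn(10)     # assumption: A = ten white draws

print(single_draw, float(ten_draws))
```

As one would expect, ten white draws in succession support the all-white hypothesis far more strongly than a single white draw does.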


Ex. 7. There are N + 1 urns each containing N balls, such that the k-th urn contains k red and N − k white balls, where k = 0, 1, 2, ..., N. An urn is chosen at random and n random drawings of a ball are made from it, the ball drawn being replaced after each drawing. If the balls drawn are all red, show that the probability that the next drawing will also yield a red ball is approximately $\dfrac{n+1}{n+2}$ when N is large. [Bombay 61]

Sol. Let event A = {all n balls turn out to be red},
event B = {the next drawing will also yield a red ball}.

To determine $P(B \mid A)$.

If the first choice falls on urn number k, then the probability of extracting in succession n red balls is $(k/N)^n$. Since n red balls in succession can be associated with any one of the N + 1 urns,

$$P(A) = \frac{1^n + 2^n + \ldots + N^n}{N^n (N+1)}. \qquad ...(1)$$

The event AB means that n + 1 drawings yield red balls, and therefore

$$P(AB) = \frac{1^{n+1} + 2^{n+1} + \ldots + N^{n+1}}{N^{n+1} (N+1)}. \qquad ...(2)$$

The required probability is $P(B \mid A) = P(AB)/P(A)$.

The sums in (1) and (2) can be considered Riemann sums approximating integrals, so that when N is large

$$P(A) \approx \frac{1}{N} \sum_{k=1}^{N} \left(\frac{k}{N}\right)^{n} \approx \int_0^1 x^n\,dx = \frac{1}{n+1}$$

and

$$P(AB) \approx \int_0^1 x^{n+1}\,dx = \frac{1}{n+2}.$$

We have therefore, for large N, approximately

$$P(B \mid A) = \frac{n+1}{n+2}.$$
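The quality of the large-N approximation can be examined by evaluating the exact sums (1) and (2). The Python sketch below is our own illustration; the values N = 1000 and n = 5 are arbitrary choices, not taken from the text.

```python
from fractions import Fraction

def p_next_red(N, n):
    """Exact P(B | A): next ball red, given n red balls drawn with replacement."""
    p_a = Fraction(sum(k ** n for k in range(N + 1)), N ** n * (N + 1))
    p_ab = Fraction(sum(k ** (n + 1) for k in range(N + 1)),
                    N ** (n + 1) * (N + 1))
    return p_ab / p_a

N, n = 1000, 5                       # illustrative values only
exact = p_next_red(N, n)
approx = Fraction(n + 1, n + 2)      # the limiting value (n + 1)/(n + 2)

print(float(exact), float(approx))
```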


4.10. Independent Bernoulli trials. By Bernoulli trials we mean random experiments repeatedly performed with only two outcomes, 'success' or 'failure', with probabilities p and q (q = 1 − p) which remain the same from repetition to repetition. The probability of r successes in n independent repetitions of the experiment is ${}^{n}C_r\,p^r q^{n-r}$.

Proof. Since the n repetitions are independent, the probability of success in r specified repetitions and failure in the remaining n − r repetitions is $p^r q^{n-r}$. The r specified repetitions in which the outcomes are successes can be chosen in ${}^{n}C_r$ different ways, and these are mutually exclusive.

Thus the probability of r successes in n independent trials is

$${}^{n}C_r\,p^r q^{n-r}, \quad r = 0, 1, 2, \ldots, n.$$

Cor. The probability of at least r successes in n independent trials with constant probability p of success is

$$\sum_{x=r}^{n} {}^{n}C_x\,p^x q^{n-x}$$

or equivalently is

$$1 - \sum_{x=0}^{r-1} {}^{n}C_x\,p^x q^{n-x},$$

since, we observe,

$$\sum_{x=0}^{n} {}^{n}C_x\,p^x q^{n-x} = (q+p)^n = 1.$$
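The two equivalent forms in the corollary can be sketched in Python as follows. The function names are our own, and the fair-coin figures (at least 8 heads in 10 tosses) are an illustrative choice rather than an example from the text.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, r, p):
    """Probability of exactly r successes in n independent Bernoulli trials."""
    q = 1 - p
    return comb(n, r) * p ** r * q ** (n - r)

def at_least(n, r, p):
    """Probability of at least r successes: direct sum over x = r, ..., n."""
    return sum(binomial_pmf(n, x, p) for x in range(r, n + 1))

def at_least_by_complement(n, r, p):
    """Same probability, as 1 minus the sum over x = 0, ..., r - 1."""
    return 1 - sum(binomial_pmf(n, x, p) for x in range(r))

half = Fraction(1, 2)
print(at_least(10, 8, half), at_least_by_complement(10, 8, half))   # 7/128 7/128
```

Which form is quicker by hand depends on r: the direct sum has n − r + 1 terms and the complement has r terms.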


Illustrative Examples 

Example 1. An experiment succeeds twice as often as it fails. Find the chance that in the next six trials there shall be at least four successes. [Agra]

Sol. Let p denote the probability of success and q the probability of failure, so that p + q = 1. Now by hypothesis

p = 2q and therefore p + q = 1 $\Rightarrow$ q = 1/3, p = 2/3.

Probability of at least four successes in the next six trials

$$= \sum_{x=4}^{6} {}^{6}C_x\,p^x q^{6-x}, \quad\text{where } p = 2/3,\ q = 1/3.$$

Required probability $= {}^{6}C_4\,p^4 q^2 + {}^{6}C_5\,p^5 q + {}^{6}C_6\,p^6$

$= (2/3)^4\,[15/9 + 12/9 + 4/9]$
$= 496/729.$





Example 2. If m things are distributed among a men and b women, show that the chance that the number of things received by men is odd is

$$\frac{1}{2} \cdot \frac{(b+a)^m - (b-a)^m}{(b+a)^m}.$$ (Agra 53, Lucknow 63)

Sol. The probability that a man gets a thing

$$= \frac{a}{a+b} = p \text{ (say)},$$

and the probability that a woman gets a thing

$$= \frac{b}{a+b} = q, \quad\text{since } \frac{a}{a+b} + \frac{b}{a+b} = 1.$$

Probability that men receive r things and women receive m − r things

$$= {}^{m}C_r\,p^r q^{m-r}, \quad r = 0, 1, 2, \ldots, m.$$

The chance that the number of things received by men is odd

$$= {}^{m}C_1\,p\,q^{m-1} + {}^{m}C_3\,p^3 q^{m-3} + {}^{m}C_5\,p^5 q^{m-5} + \ldots$$
$$= \tfrac{1}{2}\left[(q+p)^m - (q-p)^m\right]$$
$$= \frac{1}{2} \cdot \frac{(a+b)^m - (b-a)^m}{(a+b)^m}.$$


Exercises 


1. If on an average 1 vessel in every 10 is wrecked, find the probability that out of 5 vessels expected to arrive, 4 at least will arrive safely. (Agra 59)

[Ans. $\dfrac{14 \times 9^4}{10^5}$]

2. An ordinary six-faced die is thrown 4 times. What are the probabilities of obtaining 4, 3, 2, 1, 0 aces? (Agra 56)

3. A and B have equal chances of winning a single game. A wants n games and B (n + 1) games to win a match; show that the odds in favour of A are 1 + P to 1 − P, where

$$P = \frac{(2n)!}{n!\,n!\,2^{2n}}.$$ [M.Sc. Kerala]


4.11. Use of Multinomial Expansion in Probability.

There are n dice with f faces marked from 1 to f; if these are thrown at random, calculate the probability of obtaining a total equal to p.

The total number of ways, all assumed equally likely, in which n dice may appear is $f^n$, assuming that the dice are distinguishable. The number of favourable ways (in which the total number of points is p) is equal to the number of ways in which n integers ranging from 1 to f can add up to p $(n \le p \le fn)$.

This number is the coefficient of $x^p$ in

$$f(x) = (x + x^2 + x^3 + \ldots + x^f)^n,$$

since every favourable arrangement contributes one term to $x^p$ in the expansion of $f(x)$.

Now

$$f(x) = x^n (1 + x + \ldots + x^{f-1})^n = x^n (1 - x^f)^n (1 - x)^{-n},$$

so that the coefficient of $x^p$ in $f(x)$ is the coefficient of $x^{p-n}$ in $(1 - x^f)^n (1 - x)^{-n}$.

$\therefore$ required probability

$$= \frac{\text{Coeff. of } x^{p-n} \text{ in } (1 - x^f)^n (1 - x)^{-n}}{f^n}.$$


Illustrative Examples 


Ex. 1. Four dice are thrown. Find the chance that the sum of the numbers appearing will be 18. [Punjab 56]

Sol. Each die can be thrown in 6 ways and so four dice can be thrown in $6^4$ ways.

The number of numbers that can be thrown with four dice is the sum of the coefficients in the expansion of

$$(x + x^2 + x^3 + x^4 + x^5 + x^6)^4.$$

The number of ways of throwing 18 is the coefficient of $x^{18}$ in the above expansion.

The coefficient of $x^{18}$ in $(x + x^2 + x^3 + x^4 + x^5 + x^6)^4$

= coefficient of $x^{18}$ in $x^4 (1 + x + x^2 + x^3 + x^4 + x^5)^4$

= coefficient of $x^{14}$ in $(1 + x + x^2 + x^3 + x^4 + x^5)^4$

= coefficient of $x^{14}$ in $(1 - x^6)^4 (1 - x)^{-4}$

= coefficient of $x^{14}$ in $(1 - 4x^6 + 6x^{12} - \ldots)\left(1 + 4x + 10x^2 + \ldots + \dfrac{9 \cdot 10 \cdot 11}{3!}\,x^8 + \ldots + \dfrac{15 \cdot 16 \cdot 17}{3!}\,x^{14} + \ldots\right)$

$$= \frac{15 \cdot 16 \cdot 17}{6} - 4 \cdot \frac{9 \cdot 10 \cdot 11}{6} + 6 \cdot 10 = 680 - 660 + 60 = 80.$$

The required chance is therefore

$$\frac{80}{6^4} = \frac{5}{81}.$$
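The coefficient of $x^p$ in $(x + x^2 + \ldots + x^f)^n$ can also be obtained by multiplying the polynomial out term by term. The Python sketch below is our own illustration of that idea, with hypothetical function names; it represents a polynomial by the list of its coefficients, indexed by the power of x.

```python
from fractions import Fraction

def ways_by_total(n, f):
    """Coefficients of (x + x^2 + ... + x^f)^n; entry [p] counts the ways
    in which n distinguishable dice with f faces can total p."""
    coeffs = [1]                                  # the constant polynomial 1
    for _ in range(n):
        product = [0] * (len(coeffs) + f)
        for power, c in enumerate(coeffs):
            for face in range(1, f + 1):          # multiply by x^face
                product[power + face] += c
        coeffs = product
    return coeffs

def p_total(n, f, p):
    """Probability that n dice with f faces show a total of p."""
    return Fraction(ways_by_total(n, f)[p], f ** n)

print(ways_by_total(4, 6)[18], p_total(4, 6, 18))   # 80 5/81
```

The same table of coefficients answers any question about the total: for three ordinary dice, for instance, summing the entries for totals 3 to 8 gives the number of ways of throwing 8 or less.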


Ex. 2. Four tickets marked 00, 01, 10, 11 respectively are placed in a bag. A ticket is drawn at random five times, being replaced each time. Find the probability that the sum of the numbers on the tickets thus drawn is 23.

Sol. Regard being had to the different ways of making up the same total, the number of numbers that can be obtained from five tickets is the sum of the coefficients in the expansion of

$$(x^{0} + x^{1} + x^{10} + x^{11})^5 = [(1 + x) + x^{10}(1 + x)]^5 = (1 + x)^5 (1 + x^{10})^5$$
$$= (1 + 5x + 10x^2 + 10x^3 + \ldots)(1 + 5x^{10} + 10x^{20} + \ldots).$$

The sum of the coefficients will be found by putting x = 1. The total number of possible numbers is therefore $4^5 = 2^{10}$.

The coefficient of $x^{23}$ in the above expansion is

$$10 \times 10 = 100.$$

The required probability $= \dfrac{100}{2^{10}} = \dfrac{25}{256}$.

Ex. 3. Determine the probability of throwing more than 8 with 3 perfectly symmetrical dice.

Sol. The numbers more than 8 are 9, 10, ..., 18. The complementary set to these numbers is {3, 4, 5, 6, 7, 8}.

The total number of ways in which three dice can be thrown is $6^3$.

Number of ways for obtaining a sum from 3 to 8

= sum of the coefficients of $x^3, x^4, \ldots, x^8$ in the expansion of $(x + x^2 + x^3 + \ldots + x^6)^3$

= sum of the coefficients of $x^0, x^1, x^2, \ldots, x^5$ in $(1 + x + x^2 + \ldots + x^5)^3$

= sum of the coefficients of $x^0, x^1, x^2, \ldots, x^5$ in $(1 - x^6)^3 (1 - x)^{-3}$

= sum of the coefficients of $x^0, x^1, x^2, \ldots, x^5$ in $(1 - 3x^6 + 3x^{12} - x^{18})(1 + 3x + 6x^2 + 10x^3 + 15x^4 + 21x^5 + \ldots)$

= 1 + 3 + 6 + 10 + 15 + 21 = 56.

$\therefore$ probability of obtaining a sum from 3 to 8 $= \dfrac{56}{6^3} = \dfrac{7}{27}$,

and the required probability $= 1 - \dfrac{7}{27} = \dfrac{20}{27}$.

Exercises 

1. Find the probability of throwing 5 points with one die, 10 points with two dice and 15 points with three dice.

[Hint. 15 can be made up as 5, 5, 5; 3, 6, 6; 4, 5, 6.]

2. Find the chance of throwing 10 with (a) 3 dice, (b) 4 dice. [Agra 56, Goj 58, Punjab 57]

[Ans. 1/8, 5/81]

3. Counters marked 1, 2, 3 are placed in a bag, and one is drawn and replaced. The process is repeated thrice. What is the chance of obtaining a total of 6? [Punjab 58]

[Ans. 7/27]

4. Nine cards are drawn at random from a set of cards. Each is marked with one of the numbers 1, 0, −1, and it is equally likely that any of the three numbers will be drawn. Find the chance that the sum of the numbers on the cards thus drawn is zero. [Agra 60]

[Ans. $\dfrac{3139}{3^9}$]

Some Worked Examples.


Ex. 1. It is 8 : 5 against a person who is 40 years old living till he is 70 and 4 : 3 against a person now 50 living till he is 80. Find the probability that one at least of these persons will be alive 30 years hence. [Punjab 58]

The probability that the 40-year-old man will die within 30 years = 8/13.

The probability that the 50-year-old man will die within 30 years = 4/7.

Hence the probability that both will die within 30 years

$$= \frac{8}{13} \times \frac{4}{7} = \frac{32}{91}.$$

The probability that one at least of these persons will be alive 30 years hence

$$= 1 - \frac{32}{91} = \frac{59}{91}.$$


Ex. 2. If the probability of success is 1/100, how many trials are necessary in order that the probability of at least one success is greater than 1/2?

[Given that log 99 = 1.9956, log 2 = 0.3010]

p = probability of success = 1/100.

q = 1 − p = probability of failure = 99/100.

The probability of all failures in n trials $= \left(\dfrac{99}{100}\right)^n$.

Hence the probability of at least one success $= 1 - \left(\dfrac{99}{100}\right)^n$.

Hence

$$1 - \left(\frac{99}{100}\right)^n > \frac{1}{2}$$

or $\left(\dfrac{99}{100}\right)^n < \dfrac{1}{2}$

or $n\,[\log 99 - \log 100] < \log 1 - \log 2$

or $n\,(1.9956 - 2) < 0 - 0.3010$,

i.e. $n > \dfrac{0.3010}{0.0044} = 68.4$.

Hence the number of trials necessary = 69.
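The same answer can be reached without logarithm tables by simply counting trials until the chance of at least one success first exceeds 1/2. The Python sketch below is our own check; the function name is hypothetical and exact fractions are used to avoid rounding at the boundary.

```python
from fractions import Fraction

def trials_needed(p, target):
    """Smallest n for which 1 - (1 - p)^n exceeds target (requires p > 0)."""
    n = 0
    all_fail = Fraction(1)          # (1 - p)^n, starting from n = 0
    while 1 - all_fail <= target:
        n += 1
        all_fail *= (1 - p)
    return n

n_required = trials_needed(Fraction(1, 100), Fraction(1, 2))
print(n_required)   # 69
```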


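Ex. 3. A, B and C in order toss a coin. The first one to throw a head wins. What are their respective chances of winning? Assume that the game may continue indefinitely. [Agra 61, 65; I.A.S. 55]

The probability of throwing a head = 1/2.

A can throw a head first either in the first, or fourth, or seventh, ... chance.

$\therefore$ Probability of A throwing the first head

$$= \tfrac{1}{2} + \left(\tfrac{1}{2}\right)^3 \tfrac{1}{2} + \left(\tfrac{1}{2}\right)^6 \tfrac{1}{2} + \ldots \text{ to } \infty = \frac{\tfrac{1}{2}}{1 - \tfrac{1}{8}} = \frac{4}{7}.$$

Similarly, probability of B throwing the first head

$$= \tfrac{1}{2} \cdot \tfrac{1}{2} + \left(\tfrac{1}{2}\right)^4 \tfrac{1}{2} + \ldots \text{ to } \infty = \tfrac{1}{2}\left[\tfrac{1}{2} + \left(\tfrac{1}{2}\right)^3 \tfrac{1}{2} + \ldots \text{ to } \infty\right] = \tfrac{1}{2} \cdot \tfrac{4}{7} = \frac{2}{7},$$

and by similar reasoning the probability of C throwing the first head = 1/7.

This problem can be tackled more easily as follows.

Let x be the probability of A throwing the first head in the long run. Then the probability of B throwing the first head is 1/2 times the probability of A, i.e. x/2, and the probability of C throwing the first head is $\tfrac{1}{2} \cdot \tfrac{x}{2} = \tfrac{x}{4}$. Now it is certain that one of A, B, C will throw a head in the long run. Hence

$$x + \frac{x}{2} + \frac{x}{4} = 1 \;\Rightarrow\; x = \frac{4}{7}.$$

$\therefore$ The respective chances of winning are

$$\frac{4}{7},\ \frac{2}{7},\ \frac{1}{7}.$$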

Ex. 4. (Problem of points)

Players A and B want respectively m and n points of winning a set of games; their chances of winning a single game are p and q respectively, where the sum of p and q is unity; the stake is to belong to the player who first makes up his set. Determine the probabilities in favour of each player.

The set will necessarily be decided in m + n − 1 games.

A may win his m games in exactly m games, or m + 1 games, ..., or m + n − 1 games.

Suppose that A wins in exactly m + r games, where the possible values of r are 0, 1, 2, ..., n − 1. For this he must win the last game and m − 1 out of the preceding m + r − 1 games. The chance of this is

$${}^{m+r-1}C_{m-1}\,p^{m-1} q^r \cdot p, \quad\text{or}\quad {}^{m+r-1}C_{m-1}\,p^m q^r.$$

Thus A's chance is

$$\sum_{r=0}^{n-1} {}^{m+r-1}C_{m-1}\,p^m q^r.$$

Similarly B's chance, on interchanging m with n and p with q, is

$$\sum_{r=0}^{m-1} {}^{n+r-1}C_{n-1}\,q^n p^r.$$
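The two sums for the problem of points can be evaluated directly. The Python sketch below is our own illustration; the function name is hypothetical, and the numerical case (A wanting 2 points, B wanting 3, equal skill) is chosen only as an example.

```python
from fractions import Fraction
from math import comb

def chance_of_player(points_needed, opponent_needs, p):
    """Chance that a player who wins each game with probability p completes
    `points_needed` points before the opponent completes `opponent_needs`."""
    q = 1 - p
    return sum(
        comb(points_needed + r - 1, points_needed - 1) * p ** points_needed * q ** r
        for r in range(opponent_needs)
    )

p = Fraction(1, 2)
a_chance = chance_of_player(2, 3, p)        # A wants m = 2 points
b_chance = chance_of_player(3, 2, 1 - p)    # B wants n = 3 points

print(a_chance, b_chance)   # 11/16 5/16
```

Since one of the two players must make up his set, the two chances add up to 1, which gives a convenient check on the formulas.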


Exercises 

1. In a certain game A's skill is to B's as 3 to 2: find the chance of A winning 3 games at least out of 5.

[Ans. $\dfrac{2133}{3125}$]

2. The probability that a 50-year-old man will be alive at 60 is 0.83 and the probability that a 45-year-old woman will be alive at 55 is 0.87. What is the probability that a man who is 50 and his wife who is 45 will both be alive 10 years hence? [Agra 61]

[Ans. 0.7221]

3. Suppose it is 9 to 7 against a person A who is now 35 years of age living till he is 65, and 3 to 2 against a person B now 45 living till he is 75: find the chance that one at least of these persons will be alive 30 years hence. [Ans. 53/80]

4. A, B, C, D cut a pack of cards successively in the order mentioned. What are their respective chances of first cutting a spade? [Agra 61]

[Ans. $\dfrac{64}{175},\ \dfrac{48}{175},\ \dfrac{36}{175},\ \dfrac{27}{175}$]

5. Huyghens' problem. Two players A and B alternately roll a pair of unbiased dice. A wins if on a throw he obtains exactly six points before B gets seven points, B winning in the opposite event. If A begins the game, prove that his chance of winning is 30/61, and that the expected number of trials for A's win is approximately 6.

[Hints. Six points can be made up as 1, 5; 2, 4; 3, 3, and the number of ways in which these happen is 2 + 2 + 1 = 5. Hence the probability of throwing 6 = 5/36.

Similarly, since seven points can be made up as 1, 6; 2, 5; 3, 4, the probability of throwing 7 = 6/36 or 1/6.

$\therefore$ Probability of A winning the game

$$= \frac{5}{36} + \left(\frac{31}{36}\right)\left(\frac{5}{6}\right)\frac{5}{36} + \left(\frac{31}{36}\right)^2 \left(\frac{5}{6}\right)^2 \frac{5}{36} + \ldots \text{ to } \infty = \frac{5/36}{1 - 155/216} = \frac{30}{61}.$$

Expected number of trials = 6 approximately.

(Expectation $= \sum p_i x_i$)]

6. A six-faced die is so biased that it is twice as likely to show an even number as an odd number when thrown. It is thrown twice. What is the probability that the sum of the two numbers thrown is even?

7. In each of a set of games it is 2 to 1 in favour of the winner of the previous game. What is the chance that the player who wins the first game shall win three at least of the next four?

[Ans. 4/9]

8. Given three urns, of which the first contains 3 white and 4 black balls, the second contains 2 white and 3 black balls, and the third contains 3 white and 5 black balls. What is the probability of obtaining one white ball in extracting a ball from each urn?

[Ans. $\dfrac{121}{280}$]

[Hint. Required probability is the coefficient of x in

$$\left(\frac{3x}{7} + \frac{4}{7}\right)\left(\frac{2x}{5} + \frac{3}{5}\right)\left(\frac{3x}{8} + \frac{5}{8}\right).]$$

9. A shooter finds that, on the average, he kills once in three shots. He fires three times at an enemy; on the assumption that his a priori probability of killing is 1/3, what is the probability that he kills him?

[Ans. 19/27]


4.12. Probability and Number-theory.

Ex. 1. Tchebycheff's Problem. Two integers lie within the range 2 to N. What is the probability that they are prime to one another?

Sol. Any number, when divided by a suspected prime factor r, may have a remainder 0, 1, ..., r − 1; hence the probability that it is divisible by r is 1/r. Thus the probability that both the integers are divisible by r is $1/r^2$, and, therefore, the probability that they are not both divisible by r is $1 - 1/r^2$. It follows that the probability that the two integers have no common prime factor over the whole range is

$$x = \left(1 - \frac{1}{2^2}\right)\left(1 - \frac{1}{3^2}\right)\left(1 - \frac{1}{5^2}\right) \ldots \left(1 - \frac{1}{p^2}\right),$$

where p is the greatest prime in the given range 2 to N.

If N (and therefore p) is large, x can be approximated as follows.

We suppose that x is the infinite product

$$\left(1 - \frac{1}{2^2}\right)\left(1 - \frac{1}{3^2}\right)\left(1 - \frac{1}{5^2}\right) \ldots \left(1 - \frac{1}{r^2}\right) \ldots,$$

where r is always prime. Then

$$\frac{1}{x} = \left(1 - \frac{1}{2^2}\right)^{-1} \left(1 - \frac{1}{3^2}\right)^{-1} \ldots = \left(1 + \frac{1}{2^2} + \frac{1}{2^4} + \ldots\right)\left(1 + \frac{1}{3^2} + \frac{1}{3^4} + \ldots\right) \ldots,$$

and since any number is either a prime or a product of primes, it follows on multiplying out that

$$\frac{1}{x} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \ldots = \frac{\pi^2}{6}.$$

Hence $x = \dfrac{6}{\pi^2} = \dfrac{3}{5}$, approximately.
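The limiting value $6/\pi^2$ for the chance that two integers are prime to one another can be compared with a direct count over a finite range. The Python sketch below is our own check; the range 2 to 300 is an arbitrary choice, and ordered pairs (including a number paired with itself) are counted.

```python
from math import gcd, pi

def coprime_fraction(N):
    """Fraction of ordered pairs (a, b), with 2 <= a, b <= N, having gcd 1."""
    values = range(2, N + 1)
    coprime_pairs = sum(1 for a in values for b in values if gcd(a, b) == 1)
    return coprime_pairs / len(values) ** 2

observed = coprime_fraction(300)      # illustrative range only
limit = 6 / pi ** 2

print(round(observed, 4), round(limit, 4))
```

The agreement improves as N grows, in keeping with the argument above, which treats divisibility by different primes as independent only in the limit.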


Ex. 1. A five-figure number is formed by the digits 0, 1, 2, 3, 4 (without repetition). Find the probability that the number formed is divisible by 4. [Agra 59]

Sol. Digits 0, 1, 2, 3, 4 can be arranged in 5! ways, out of which the numbers which start with 0 are 4! in number.

Hence the total number of five-figure numbers

= 5! − 4! = 96.

Now the numbers ending in 04, 12, 20, 24, 32, 40 will be divisible by 4.

No. of numbers ending with 04 = total number of ways of arranging the remaining digits 1, 2, 3 = 3! = 6.

Evidently this is also the number of numbers ending with 20 and 40.

No. of numbers ending with 12 = 3! − 2! = 4, for those numbers which start with 0 are to be discarded and their number is 2!.

Evidently this is also the number of numbers ending with 24 and 32.

Hence the total number of five-figure numbers formed by the digits 0, 1, 2, 3, 4 and divisible by 4

= 3 (6 + 4) = 30.

$\therefore$ required probability $= \dfrac{30}{96} = \dfrac{5}{16}$.
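Because there are only 96 admissible numbers, the count can be confirmed by listing them all. The following Python sketch is our own brute-force check and is not part of the original solution.

```python
from fractions import Fraction
from itertools import permutations

# All five-figure numbers using each of the digits 0, 1, 2, 3, 4 exactly once;
# arrangements with a leading 0 are not five-figure numbers and are dropped.
numbers = [int("".join(digits))
           for digits in permutations("01234")
           if digits[0] != "0"]

divisible_by_4 = [x for x in numbers if x % 4 == 0]
probability = Fraction(len(divisible_by_4), len(numbers))

print(len(numbers), len(divisible_by_4), probability)   # 96 30 5/16
```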


Ex. 2. Out of 3n consecutive numbers, 3 are selected at random. Find the chance that their sum is divisible by 3. [Agra M.Sc. 53]

Sol. Any three of 3n consecutive integers can be taken in

$${}^{3n}C_3 = \frac{3n(3n-1)(3n-2)}{1 \cdot 2 \cdot 3} \text{ ways.}$$

The 3n consecutive integers can be arranged as follows:

3x : 3, 6, 9, 12, 15, ..., 3n
3y − 1 : 2, 5, 8, 11, ..., 3n − 1
3z − 2 : 1, 4, 7, 10, ..., 3n − 2

Evidently the sum of three numbers will not be divisible by 3 if two numbers are taken from any one of the orders 3x, 3y − 1, 3z − 2 and the third from one of the others, where x, y, z can have any value from 1 to n. The number of ways of selecting three numbers in this manner

$$= 3 \times {}^{n}C_2 \times {}^{2n}C_1 = 3 \cdot \frac{n(n-1)}{2} \cdot 2n = 3n^2(n-1).$$

$\therefore$ the probability that the sum is not divisible by 3

$$= \frac{3n^2(n-1) \cdot 6}{3n(3n-1)(3n-2)} = \frac{6n(n-1)}{(3n-1)(3n-2)}.$$

Hence the probability that the sum is divisible by 3 is

$$1 - \frac{6n(n-1)}{(3n-1)(3n-2)} = \frac{3n^2 - 3n + 2}{9n^2 - 9n + 2}.$$
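For small n the result can be checked by listing every selection of three numbers. The Python sketch below is our own check; the function names and the starting value of the run of consecutive integers are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations

def p_sum_divisible_by_3(n, start=1):
    """Brute-force chance that 3 of the 3n consecutive integers beginning
    at `start` have a sum divisible by 3."""
    numbers = range(start, start + 3 * n)
    triples = list(combinations(numbers, 3))
    favourable = sum(1 for t in triples if sum(t) % 3 == 0)
    return Fraction(favourable, len(triples))

def closed_form(n):
    """The value (3n^2 - 3n + 2) / (9n^2 - 9n + 2) obtained in the solution."""
    return Fraction(3 * n * n - 3 * n + 2, 9 * n * n - 9 * n + 2)

for n in range(1, 6):
    print(n, p_sum_divisible_by_3(n), closed_form(n))
```

The starting value does not matter, since any 3n consecutive integers contain exactly n numbers of each of the three orders.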

Ex. 3. Out of (2n+1) tickets consecutively numbered, three are drawn at random. Find the chance that the numbers on them are in arithmetical progression. [Agra 50, Delhi 69]

Sol. The total number of ways of drawing three tickets

$$= \frac{(2n+1)\,2n\,(2n-1)}{6}.$$

Suppose the lowest number drawn is 4; then the sets of three numbers in A.P. beginning with 4 will be

4, 5, 6; 4, 6, 8; 4, 7, 10; ...; 4, (n+1), (2n-2); 4, (n+2), 2n

which are obviously n-2 in number (being the number of numbers from 5 to n+2, both numbers inclusive).

In a similar manner, by virtually writing down the sets of numbers that can be in A.P., we write down the following schedule:

Lowest number of the three    No. of favourable ways
1                             n
2                             n - 1
3                             n - 1
4                             n - 2
5                             n - 2
6                             n - 3
...                           ...
2n - 2                        1
2n - 1                        1

$\therefore$ total number of favourable ways

$$= n + 2\,[1 + 2 + 3 + \dots + (n-1)] = n + 2\cdot\frac{n(n-1)}{2} = n^2.$$

$\therefore$ required probability $= \dfrac{6n^2}{(2n+1)\,2n\,(2n-1)} = \dfrac{3n}{4n^2-1}.$
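As a hedged check of Ex. 3, the sketch below lists every draw of three tickets from 1, 2, ..., 2n+1 and counts those in arithmetical progression. The function name is hypothetical, and numbering the tickets from 1 is an assumption made only for illustration.

```python
from fractions import Fraction
from itertools import combinations


def chance_of_ap(n):
    """Chance that three tickets drawn from 1..2n+1 are in A.P."""
    tickets = range(1, 2 * n + 2)
    draws = list(combinations(tickets, 3))  # each draw comes out sorted
    in_ap = sum(1 for a, b, c in draws if b - a == c - b)
    return Fraction(in_ap, len(draws))


for n in range(1, 7):
    print(n, chance_of_ap(n), Fraction(3 * n, 4 * n * n - 1))
```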

Ex. 4. If four whole numbers taken at random are multiplied together, show that the chance that the last digit in the product is 1, 3, 7 or 9 is $\frac{16}{625}$. [Agra M.Sc. 61]

Sol. If the last digit in the product is not 1, 3, 7 or 9 it must be 0, 2, 4, 5, 6 or 8, and this happens as soon as any one of the numbers is even or ends in 5.

Therefore, for the product to end in 1, 3, 7 or 9, none of the four numbers may end in 0, 2, 4, 5, 6 or 8.

The chance that any one of the four numbers does not end in any of these is 4/10, i.e. 2/5.

$\therefore$ by compound probability, the required chance

$$= \left(\frac{2}{5}\right)^4 = \frac{16}{625}.$$

Ex. 5. If n integers taken at random are multiplied together, show that (a) the chance that the last digit of the product is 1, 3, 7 or 9 is $(2/5)^n$, (b) the chance of its being 2, 4, 6 or 8 is $\dfrac{4^n - 2^n}{5^n}$, (c) of its being 5 is $\dfrac{5^n - 4^n}{10^n}$, and (d) of its being 0 is

$$\frac{10^n - 8^n - 5^n + 4^n}{10^n}.$$

Sol. (a) If the last digit be 1, 3, 7 or 9, none of the numbers can end in 2, 4, 6, 8, 0 or 5; that is, we have a choice of 4 digits with which to end each of our n numbers.

Thus the required chance $= \dfrac{4^n}{10^n} = \left(\dfrac{2}{5}\right)^n.$

(b) If the last digit be 2, 4, 6 or 8, none of the numbers can end in 0 or 5 and at least one of the last digits must be even.

Now, 0 and 5 can be excluded in $8^n$ ways; and out of these we have further to exclude the $4^n$ cases in which the last digits are selected solely from 1, 3, 7 or 9. Thus the required chance

$$= \frac{8^n - 4^n}{10^n} = \frac{4^n - 2^n}{5^n}.$$

(c) If the last digit is 5, at least one of the numbers must end in 5 and all the rest must end in 1, 3, 5, 7 or 9. Now $5^n$ is the number of ways in which an odd digit can be chosen to end each number, but to ensure 5 being one of them we must exclude the $4^n$ ways in which our odd digits can be chosen solely from 1, 3, 7 or 9.

$\therefore$ the required chance $= \dfrac{5^n - 4^n}{10^n}.$

(d) The product will end in 0 if it does not end in 1, 3, 7, 9; 2, 4, 6, 8; or 5. The probabilities for these have been found above.

$\therefore$ the required chance

$$= 1 - \frac{2^n}{5^n} - \frac{4^n - 2^n}{5^n} - \frac{5^n - 4^n}{10^n} = \frac{10^n - 8^n - 5^n + 4^n}{10^n}.$$
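The four results of Ex. 5 can be verified for small n by running through every possible set of last digits. This is a sketch only; the function name and the grouping keys are hypothetical labels chosen for illustration.

```python
from fractions import Fraction
from itertools import product
from math import prod


def last_digit_chances(n):
    """Chances that the product of n random integers ends in a digit of
    each group, found by enumerating all 10**n last-digit combinations."""
    counts = {"1379": 0, "2468": 0, "5": 0, "0": 0}
    for digits in product(range(10), repeat=n):
        last = str(prod(digits) % 10)
        for group in counts:
            if last in group:
                counts[group] += 1
    total = 10 ** n
    return {group: Fraction(c, total) for group, c in counts.items()}


for n in range(1, 5):
    print(n, last_digit_chances(n))
```

For each n the four chances add up to 1, as they must.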

Ex. 6. If 6n tickets numbered 0, 1, 2, ..., 6n-1 are placed in a bag and three are drawn out, show that the chance that the sum of the numbers on them is equal to 6n is

$$\frac{3n}{(6n-1)(6n-2)}.$$

Sol. Total number of ways in which three tickets can be drawn

$$= \frac{6n(6n-1)(6n-2)}{1\cdot 2\cdot 3} = n(6n-1)(6n-2).$$

To find the number of ways in which the sum of the numbers drawn is 6n we proceed as follows. With the number 0, 6n can be made up from two of the other numbers in 3n-1 ways, viz. (1, 6n-1), (2, 6n-2), ..., (3n-1, 3n+1) [3n-1 being the number just preceding ½(6n)].

With the number 1, 6n-1 can be made up from two of the numbers 2, 3, ..., 6n-1 in ½(6n-4), i.e. 3n-2, ways, for only the combinations (2, 6n-3); (3, 6n-4); ... are permissible, and the last two of the numbers 2, 3, ..., 6n-1 are rendered useless.

With the number 2, 6n-2 can be made up from two of the numbers 3, 4, ..., 6n-1 as (3, 6n-5); (4, 6n-6); ..., which form in all 3n-4 ways.

With the number 3, 6n-3 can be made up from 4, 5, 6, ..., 6n-1 in 3n-5 ways. Finally, with the number 2n-2 there are only two ways of making up the sum 6n, viz. 2n-2, 2n-1, 2n+3 and 2n-2, 2n, 2n+2, whereas with the number 2n-1 there is only one way of making up 6n, viz. 2n-1, 2n, 2n+1.

Hence the number of ways of making up 6n is the sum of 2n terms, which may be arranged in pairs as follows:

$$\{(3n-1)+(3n-2)\} + \{(3n-4)+(3n-5)\} + \dots + (5+4) + (2+1)$$

$$= (6n-3) + (6n-9) + \dots + 9 + 3$$

$$= \sum_{r=1}^{n} (6r-3) = 6\cdot\frac{n(n+1)}{2} - 3n = 3n^2.$$

$\therefore$ required chance $= \dfrac{3n^2}{n(6n-1)(6n-2)} = \dfrac{3n}{(6n-1)(6n-2)}.$
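The count $3n^2$ obtained in Ex. 6 can be confirmed by enumeration for small n. The sketch below uses a hypothetical function name and lists every draw of three tickets from 0, 1, ..., 6n-1.

```python
from fractions import Fraction
from itertools import combinations


def chance_sum_is_6n(n):
    """Chance that three tickets drawn from 0..6n-1 add up to 6n."""
    draws = list(combinations(range(6 * n), 3))
    hits = sum(1 for d in draws if sum(d) == 6 * n)
    assert hits == 3 * n * n  # the count found in the solution
    return Fraction(hits, len(draws))


for n in range(1, 5):
    print(n, chance_sum_is_6n(n), Fraction(3 * n, (6 * n - 1) * (6 * n - 2)))
```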

Exercises 


1. Four positive integers are chosen at random. Find the 
chance of their having a common factor. 

M'-SKl 

2. A number consists of 7 digits whose sum is 59. Find the 

chance of its being divisible by 11 [Ans. 4/21] 

3. An integer is chosen at random from the first two hundred positive integers. What is the probability that the integer chosen is divisible by 6 or 8? [Agra M.Sc. 67]

[Ans. 1/4]

4. Four different objects 1, 2, 3, 4 are distributed at random on four places marked 1, 2, 3, 4. What is the probability that none of the objects occupies the place corresponding to its number? [Ans. 3/8]

5. From a box containing 100 tickets numbered 1 to 100, two tickets are drawn at random. What is the probability that the numbers on the tickets differ by more than 10?

[Ans. 89/110]

6. A bag contains 50 tickets numbered 1, 2, 3, ..., 50, of which five are drawn at random and arranged in ascending order of magnitude ($x_1 < x_2 < x_3 < x_4 < x_5$). What is the probability that $x_3 = 30$?

[Ans. $\dfrac{{}^{29}C_2 \times {}^{20}C_2}{{}^{50}C_5}$]

7. An urn contains 4 tickets numbered 1, 2, 3, 4 and another contains 6 tickets numbered 2, 4, 6, 7, 8, 9. If one of the two urns is chosen at random and a ticket is drawn at random from the chosen urn, find the probability that the ticket drawn bears the number

(i) 2 or 4, (ii) 3, (iii) 1 or 9.

[Ans. 5/12, 1/8, 5/24]

8. Find the probability of getting 37 in randomly forming 
2-digit numbers from out of 1, 3, 5, 7, 9, repetitions of a digit be¬ 
ing allowed in forming the numbers ? 

9. The sum of two positive quantities is equal to 2n. Find the chance that their product is not less than 3/4 times their greatest product. [Ans. 1/2]

10. Two different digits are selected at random from the digits 1 to 9.

If the sum is odd, what is the probability that 2 is one of the numbers selected? [Ans. 1/4]

11. What is the probability that among k random digits (a) neither 0 nor 1 appears, (b) zero appears exactly 3 times ($3 \le k$)?

[Ans. (a) $\dfrac{8^k}{10^k}$, (b) ${}^{k}C_3\,\dfrac{9^{k-3}}{10^k}$]


4.12. Trees and State Diagrams. It is possible to give a graphical interpretation to certain simple problems of probability which arise in dealing with repeated trials of an experiment. For example, in repeated throws of a coin we might call heads successes and tails failures. We assume there is given a probability p for success and a probability q = 1 - p for failure. The tree measure for a sequence of two such experiments is shown in the figure below.

[Figure: tree for two throws; at each stage one branch leads to H with weight p and the other to T with weight q.]

The probability of getting, say, HT can be directly computed from the weighted length of the associated tree path, that is, pq.

If it is desired to obtain the probability of getting a head and a tail irrespective of their order, then the answer is given by summing up the two weighted tree paths:

pq + qp = 2pq.

This simple graphical procedure can be used profitably in certain types of problems.

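Ex. 1. Suppose that we have three urns: urn 1 contains 3 red and 5 white balls; urn 2 contains 2 red and 1 white ball; urn 3 contains 2 red and 3 white balls. An urn is selected at random and a ball is drawn from the urn. If the ball is red, what is the probability that it came from urn 1?

Sol. Construct the tree diagram as shown in the figure given below.

[Figure: tree diagram; the first stage selects urn 1, 2 or 3 with probability 1/3 each, the second stage draws a red (R) or white (W) ball.]

By Bayes' theorem

$$P(1 \mid R) = \frac{P(1)P(R \mid 1)}{P(1)P(R \mid 1) + P(2)P(R \mid 2) + P(3)P(R \mid 3)}$$

$$= \frac{\frac13\cdot\frac38}{\frac13\cdot\frac38 + \frac13\cdot\frac23 + \frac13\cdot\frac25} = \frac{45}{173}.$$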
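The tree measure of Ex. 1 can be written out as a table of path weights, from which the conditional probability follows by summing paths. This is a minimal sketch; the variable names are hypothetical.

```python
from fractions import Fraction

# urn number -> (red balls, white balls), as given in Ex. 1
urns = {1: (3, 5), 2: (2, 1), 3: (2, 3)}

# weight of each tree path (urn, colour) = P(urn) * P(colour | urn)
paths = {}
for urn, (red, white) in urns.items():
    p_urn = Fraction(1, len(urns))
    paths[(urn, "R")] = p_urn * Fraction(red, red + white)
    paths[(urn, "W")] = p_urn * Fraction(white, red + white)

# total weight of all paths ending in a red ball
p_red = sum(w for (urn, colour), w in paths.items() if colour == "R")

# Bayes' theorem: weight of the path (urn 1, R) over all red paths
p_urn1_given_red = paths[(1, "R")] / p_red
print(p_red, p_urn1_given_red)
```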

Exercises 

1. A coin is thrown. If a head turns up, a die is rolled. If a 
tail turns up, the coin is thrown again. Construct a tree measure 
to represent the two experiments and find the probability that the 
die is thrown and a six turns up. 

2. A box contains three defective light bulbs and seven good ones. Construct a tree to show the possibilities if three consecutive bulbs are drawn at random from the box (they are not replaced after being drawn). Assign a tree measure and find the probability that at least one good bulb is drawn out. Find the probability that all three are good if the first bulb is good.

[Ans. 119/120, 5/12]

3. A box contains three coins: one coin is fair, one coin is two-headed, and one coin is weighted so that the probability of heads appearing is 1/3. A coin is selected at random and tossed. Find the probability that head appears. [Ans. 11/18]

4. We are given two urns as follows: urn A contains 3 red and 2 white balls, urn B contains 2 red and 5 white balls. An urn is selected at random; a ball is drawn and put into the other urn; then a ball is drawn from the second urn. Find the probability that both balls drawn are of the same colour.

[Ans. 901/1680]

5. There are two urns, A and B. Urn A contains one black and one red ball. Urn B contains two black and three red balls. A ball is chosen at random from urn A and put into urn B. A ball is then drawn at random from urn B.

(a) What is the probability that both balls drawn are of the same colour? [Ans. 7/12]

(b) What is the probability that the first ball drawn was red, given that the second ball drawn was black? [Ans. 2/5]

6. A certain team has probability 2/3 of winning whenever it plays.

(a) What is the probability that the team will win exactly four out of five games?

(b) What is the probability that the team will win at most four out of five games?

(c) What is the probability that the team will win exactly four games out of five if it has already won the first two games of the five?

[Ans. (a) 80/243; (b) 211/243; (c) 4/9]

4.13. Inverse Probability.

Suppose that we know that a particular event has happened in consequence of some one of a certain number of causes; we may feel interested in estimating the probability of each cause being the true one, and thence to deduce the probability of future events occurring under the operation of the same causes. Problems of this nature fall in the domain of inverse probability.

Theorem. An observed event has happened through some one of a number of mutually exclusive causes; to find the probability of any assigned cause being the true one.

Proof. Let there be n causes for the event to follow and, before the event took place, let their respective a priori probabilities be estimated at $P_1, P_2, \dots, P_n$. Let $p_r$ denote the probability that when the r-th cause exists the event will follow. After the event has occurred it is required to find the a posteriori probability $Q_r$ that the r-th cause was the true one.

Consider a very great number N of trials; then the first cause exists in $P_1N$ of these, and out of this number the event follows in $p_1P_1N$; similarly there are $p_2P_2N$ trials in which the event follows from the second cause; and so on for each of the other causes. Hence the total number of trials in which the event follows

$$= (p_1P_1 + p_2P_2 + \dots + p_nP_n)\,N = N\,\Sigma(pP).$$

The number in which the event was due to the r-th cause

$$= p_rP_rN.$$

Hence after the event the probability that the r-th cause was the true one is given by

$$Q_r = \frac{p_rP_r}{p_1P_1 + p_2P_2 + \dots + p_nP_n}.$$

From this result it appears that $Q_1 + Q_2 + \dots + Q_n = 1$, or $\Sigma(Q) = 1$, and we can write

$$\frac{Q_1}{p_1P_1} = \frac{Q_2}{p_2P_2} = \dots = \frac{Q_n}{p_nP_n} = \frac{\Sigma(Q)}{\Sigma(pP)} = \frac{1}{\Sigma(pP)}.$$

It is clear that in the present class of problems the product $P_rp_r$ will have to be correctly estimated as a first step; in many cases, however, it will be found that $P_1, P_2, P_3, \dots$ are all equal, and the work is thereby much simplified.

The chance that on a second trial the event will happen from the r-th cause is clearly $p_rQ_r$, for $p_r$ is the chance that the event will happen from the r-th cause if in existence, and the chance that the r-th cause is the true one is $Q_r$.
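The theorem translates directly into a short computation. The following is a sketch under the notation of the proof; the function names `posterior` and `chance_on_second_trial` are hypothetical. It is applied to the data of Ex. 1 of this section, where $P_1 = 3/7$, $P_2 = 4/7$, $p_1 = 3/7$, $p_2 = 5/7$.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Return Q_r = p_r P_r / sum(p P) for every cause r.

    priors      -- the a priori probabilities P_1, ..., P_n
    likelihoods -- p_1, ..., p_n, the chance of the event under each cause
    """
    weights = [P * p for P, p in zip(priors, likelihoods)]
    total = sum(weights)
    return [w / total for w in weights]


def chance_on_second_trial(priors, likelihoods):
    """Chance that the event happens again: the sum of p_r Q_r."""
    Q = posterior(priors, likelihoods)
    return sum(p * q for p, q in zip(likelihoods, Q))


P = [Fraction(3, 7), Fraction(4, 7)]  # bag taken from group 1 / group 2
p = [Fraction(3, 7), Fraction(5, 7)]  # chance of a black ball in each group
print(posterior(P, p))
print(chance_on_second_trial(P, p))
```

Exact fractions are used so that the output can be compared with the hand computation without rounding.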

Ex. 1. There are 3 bags each containing 4 white balls and 3 black balls, and 4 bags each containing 2 white balls and 5 black balls: a black ball having been drawn, find the chance that it came from the first group.

Sol. There are 3 + 4 = 7 bags in all, out of which 3 belong to the first group and 4 belong to the second group; hence

$$P_1 = 3/7, \quad P_2 = 4/7.$$

If a bag is selected from the first group the chance of drawing a black ball is 3/7; if from the second group the chance is 5/7; thus $p_1 = 3/7$, $p_2 = 5/7$.

$$\therefore\; p_1P_1 = \frac{9}{49}, \quad p_2P_2 = \frac{20}{49}.$$

Hence the chance that the black ball came from the first group

$$= \frac{9}{49} \div \left(\frac{9}{49} + \frac{20}{49}\right) = \frac{9}{29}.$$

Ex. 2. A speaks the truth 2 out of 3 times, and B 4 times out of 5; they agree in the assertion that from a bag containing 6 balls of different colours a red ball has been drawn. Find the probability that the statement is true. [Agra B.Sc. 59]

Sol. There are two possible hypotheses: (a) their coincident testimony is true, (b) it is false; hence

$$P_1 = 1/6, \quad P_2 = 5/6.$$

In case of the first hypothesis, the probability of both A and B agreeing in the assertion is given by

$$p_1 = 2/3 \times 4/5.$$

In case of the second hypothesis a red ball has not actually been drawn, but A and B both assert that a red ball has been drawn; thus

$$p_2 = \frac{1}{25} \times \frac{1}{3} \times \frac{1}{5}$$

(since after excluding the ball actually drawn, A and B each have to choose one from the remaining 5).

Hence $p_1P_1 = \dfrac16 \times \dfrac23 \times \dfrac45 = \dfrac{4}{45}$; $p_2P_2 = \dfrac56 \times \dfrac{1}{25} \times \dfrac13 \times \dfrac15 = \dfrac{1}{450}.$

Hence the probability that the statement is true

$$= \frac{p_1P_1}{p_1P_1 + p_2P_2} = \frac{4/45}{4/45 + 1/450} = \frac{40}{41}.$$

Ex. 3. A and B are two very weak students of Statistics and their chances of solving a problem correctly are 1/8 and 1/12 respectively; if they obtain the same result, and if it is 1000 to 1 against their making the same mistake, find the chance that the result is correct. [Agra B.Sc. 58]

Sol. There are two hypotheses: (a) A and B get the same correct result, (b) A and B get the same incorrect result. The chances of both these hypotheses are taken to be equal; hence

$$P_1 = P_2 = \tfrac12.$$

When the first hypothesis exists, the probability that both A and B get the same correct result is

$$p_1 = 1/8 \times 1/12.$$

Under the second hypothesis, the probability that both A and B get the same incorrect result is

$$p_2 = \frac{1}{1001} \times \left(1 - \frac18\right) \times \left(1 - \frac{1}{12}\right) = \frac{1}{13\cdot 8\cdot 12}.$$

Thus the chance that their solution is correct

$$= \frac{p_1P_1}{p_1P_1 + p_2P_2} = \frac{\frac12\cdot\frac18\cdot\frac{1}{12}}{\frac12\cdot\frac18\cdot\frac{1}{12} + \frac12\cdot\frac{1}{13\cdot 8\cdot 12}} = \frac{13}{14}.$$

Ex. 4. A purse contains 4 coins which are either rupees or ten paise; 2 coins are drawn and found to be ten paise; if these are replaced, what is the chance that another drawing will give a rupee?

Sol. Here we can formulate three hypotheses: (a) all the coins may be ten paise, (b) three of them may be ten paise, (c) two of them may be ten paise. These hypotheses may be equally likely to occur. Hence

$$P_1 = P_2 = P_3 = \tfrac13$$

and $p_1 = 1$, $p_2 = \dfrac{{}^3C_2}{{}^4C_2} = \dfrac12$, $p_3 = \dfrac{{}^2C_2}{{}^4C_2} = \dfrac16.$

Thus

$$\frac{Q_1}{p_1P_1} = \frac{Q_2}{p_2P_2} = \frac{Q_3}{p_3P_3} = \frac{1}{\Sigma(pP)}$$

gives

$$Q_1 = \frac{1}{1 + \frac12 + \frac16} = \frac{6}{10}, \quad Q_2 = \frac{\frac12}{1 + \frac12 + \frac16} = \frac{3}{10}, \quad Q_3 = \frac{\frac16}{1 + \frac12 + \frac16} = \frac{1}{10}.$$

Therefore the probability that another drawing will give a rupee

$$= (Q_1 \times 0) + (Q_2 \times \tfrac14) + (Q_3 \times \tfrac12) = \frac{3}{40} + \frac{1}{20} = \frac18.$$

We can also interpret the question in the following manner.

If each coin is equally likely to be a ten paise or a rupee, by taking the terms in the expansion of $(\frac12 + \frac12)^4$, we see that the chance of four ten paise is 1/16, of three ten paise is ${}^4C_3(\frac12)^3(\frac12)$ or 4/16, of two ten paise is ${}^4C_2(\frac12)^2(\frac12)^2$ or 6/16; thus

$$P_1 = 1/16, \quad P_2 = 4/16, \quad P_3 = 6/16;$$

also, as before,

$$p_1 = 1, \quad p_2 = \tfrac12, \quad p_3 = \tfrac16.$$

Hence

$$\frac{Q_1}{\frac{1}{16}} = \frac{Q_2}{\frac{4}{16}\cdot\frac12} = \frac{Q_3}{\frac{6}{16}\cdot\frac16} = \frac{1}{\frac{1}{16} + \frac{2}{16} + \frac{1}{16}},$$

so that $Q_1 = \frac14$, $Q_2 = \frac12$, $Q_3 = \frac14$.

Therefore the probability that another drawing will give a rupee

$$= (Q_1 \times 0) + (Q_2 \times \tfrac14) + (Q_3 \times \tfrac12) = \frac18 + \frac18 = \frac14.$$
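Both interpretations of Ex. 4 differ only in the a priori probabilities, so a single routine can handle them. This is a sketch with a hypothetical function name; the three hypotheses are that 4, 3 or 2 of the coins are ten paise.

```python
from fractions import Fraction
from math import comb


def chance_next_is_rupee(priors):
    """Chance that a further single draw gives a rupee, given that two
    coins drawn from the purse of 4 were both ten paise."""
    tens = [4, 3, 2]  # number of ten-paise coins under each hypothesis
    # p_r: chance that both coins drawn are ten paise
    likelihoods = [Fraction(comb(t, 2), comb(4, 2)) for t in tens]
    weights = [P * p for P, p in zip(priors, likelihoods)]
    total = sum(weights)
    posteriors = [w / total for w in weights]
    # under each hypothesis (4 - t) of the 4 coins are rupees
    return sum(Q * Fraction(4 - t, 4) for Q, t in zip(posteriors, tens))


equal = [Fraction(1, 3)] * 3
binomial = [Fraction(1, 16), Fraction(4, 16), Fraction(6, 16)]
print(chance_next_is_rupee(equal), chance_next_is_rupee(binomial))
```

The binomial a priori probabilities need not add up to 1 here, because only their ratios enter the a posteriori probabilities.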

Ex. 5. There is a raffle with 10 tickets and two prizes of value Rs 500 and Rs 100 respectively. A holds one ticket and is informed by B that he has won the Rs 500 prize, while C asserts that he has won the Rs 100 prize. What is A's expectation, if the credibility of B is denoted by 2/3, and that of C by 3/4?

Sol. There are three hypotheses: A may have won Rs 500, Rs 100, or nothing, for B and C may both have been mistaken.

Thus $P_1 = 1/10$, $P_2 = 1/10$, $P_3 = 8/10$.

Since B's credibility is represented by 2/3, and C's by 3/4, we have

$p_1 = 2/3 \times 1/4$ (B correct and C incorrect),
$p_2 = 1/3 \times 3/4$ (B incorrect and C correct),
$p_3 = 1/3 \times 1/4$ (B incorrect and C incorrect).

$$\therefore\; \frac{Q_1}{p_1P_1} = \frac{Q_2}{p_2P_2} = \frac{Q_3}{p_3P_3} = \frac{1}{\Sigma(pP)}$$

gives

$$\frac{Q_1}{\frac23\cdot\frac14\cdot\frac{1}{10}} = \frac{Q_2}{\frac13\cdot\frac34\cdot\frac{1}{10}} = \frac{Q_3}{\frac13\cdot\frac14\cdot\frac{8}{10}}$$

or

$$\frac{Q_1}{2} = \frac{Q_2}{3} = \frac{Q_3}{8} = \frac{1}{13},$$

whence $Q_1 = 2/13$, $Q_2 = 3/13$, $Q_3 = 8/13$.

Thus A's expectation

$$= \tfrac{2}{13} \text{ of Rs } 500 + \tfrac{3}{13} \text{ of Rs } 100 + \tfrac{8}{13} \text{ of zero} = \text{Rs } 100.$$

EXERCISES 


1. There are four balls in a bag, but it is not known of what colours they are; one ball is drawn and found to be white: find the chance that all the balls are white. [Ans. 2/5]

2. In a bag there are six balls of unknown colours; three balls are drawn and found to be black; find the chance that no black ball is left in the bag.

[Hint. The four hypotheses are 6 black balls, or 5, or 4, or 3, and these are all equally likely.

And $p_1 = \dfrac{{}^6C_3}{{}^6C_3} = 1$, $p_2 = \dfrac{{}^5C_3}{{}^6C_3} = \dfrac12$, $p_3 = \dfrac{{}^4C_3}{{}^6C_3} = \dfrac15$, $p_4 = \dfrac{{}^3C_3}{{}^6C_3} = \dfrac{1}{20}$.

$\therefore\; \dfrac{Q_1}{20} = \dfrac{Q_2}{10} = \dfrac{Q_3}{4} = \dfrac{Q_4}{1} = \dfrac{\Sigma(Q)}{35} = \dfrac{1}{35}$; required chance $= Q_4 = \dfrac{1}{35}$.]

3. Before a race the chances of three runners, A, B, C, were estimated to be proportional to 5, 3, 2, but during the race A meets with an accident which reduces his chance to one-third. What are now the respective chances of B and C?

[Hint. A could lose in two ways: either by B winning or by C winning. The probabilities of these two events are 3/10, 2/10 respectively. Therefore A's a priori chance of losing was 3/10 + 2/10 or 1/2. But after the accident his chance of losing becomes 2/3; that is, his chance of losing is increased in the ratio of 2/3 ÷ 1/2, or 4 to 3. Therefore, also, B's and C's chances of winning are increased in the same ratio. Thus B's chance of winning = 3/10 × 4/3 = 2/5; and C's chance of winning = 2/10 × 4/3 = 4/15.]

4. A speaks the truth 3 out of 4 times, and B 5 out of 6 times; what is the probability that they will contradict each other in stating the same fact?

[Hint: 3/4 × 1/6 + 1/4 × 5/6 = 1/3]

5. A die is thrown three times and the sum of the three numbers thrown is 15: find the chance that the first throw was four. [Ans. 1/5]

6. From a bag containing n balls, all either white or black, all numbers of each being equally likely, a ball is drawn which turns out to be white; this is replaced, and another ball is drawn, which also turns out to be white. If this ball is replaced, prove that the chance of the next draw giving a black ball is $\frac12(n-1)(2n+1)^{-1}$. [Lucknow B.Sc. 65]

[Hint. There may be 1, 2, 3, ..., n white balls, and all these cases are equally likely, so that $P_1 = P_2 = P_3 = \dots = P_n$.

If there were r white balls, the chance of drawing two white balls in this case would be

$$(r/n) \times (r/n) = (r/n)^2.$$

Thus

$$\frac{Q_1}{(1/n)^2} = \frac{Q_2}{(2/n)^2} = \dots = \frac{Q_r}{(r/n)^2} = \dots = \frac{1}{\Sigma (r/n)^2} = \frac{6n}{(n+1)(2n+1)}$$

and

$$Q_r = \frac{6r^2}{n(n+1)(2n+1)}.$$

And the chance of another drawing giving a black ball

$$= \sum_{r=1}^{n} \left(1 - \frac{r}{n}\right)\frac{6r^2}{n(n+1)(2n+1)} = \frac{n-1}{2(2n+1)}.]$$

7. There is a raffle with 12 tickets and two prizes of Rs 900 and Rs 300. A, B, C, whose probabilities of speaking the truth are 1/2, 2/3, 3/5 respectively, report the result to D, who holds one ticket. A and B assert that he has won the Rs 900 prize, and C asserts that he has won the Rs 300 prize; what is D's expectation?

[Ans. Rs 500/3]

8. A bag contains 6 black balls and an unknown number, not greater than six, of white balls; three are drawn successively and not replaced and are all found to be white; prove that the chance that a black ball will be drawn next is 677/909.

9. An urn contains four balls, which are known to be either (a) all white or (b) two white and two black. A ball is drawn at random and found to be white; what is the probability that all the balls are white? [Ans. 2/3]

Miscellaneous Worked Examples.

Ex. 1. Cards are dealt one by one from a well shuffled pack until an ace appears. Show that the probability that exactly n cards are dealt before the first ace is

$$\frac{(51-n)(50-n)(49-n)}{13\cdot 51\cdot 50\cdot 49}.$$

[I.C.A.R. 50]

If cards continue to be dealt until a second ace appears, prove that the probability that exactly r cards are dealt in all before the second ace is

$$\frac{r(51-r)(50-r)}{13\cdot 17\cdot 50\cdot 49}.$$

[I.C.A.R. 50]

Sol. Since in the first n cards the ace does not appear, the probability of this is clearly ${}^{48}C_n / {}^{52}C_n$. Now after n dealings we are left with 4 aces and 48-n other cards, i.e. 52-n cards in all; hence the probability that the (n+1)th card is an ace is $\dfrac{4}{52-n}$. Therefore by the rule of compound probability, the required chance is

$$\frac{{}^{48}C_n}{{}^{52}C_n} \times \frac{4}{52-n} = \frac{48!}{n!\,(48-n)!}\cdot\frac{n!\,(52-n)!}{52!}\cdot\frac{4}{52-n} = \frac{(51-n)(50-n)(49-n)}{13\cdot 51\cdot 50\cdot 49}.$$

For the second part, the probability that the first r cards contain exactly one ace

$$= \frac{{}^{4}C_1 \times {}^{48}C_{r-1}}{{}^{52}C_r}.$$

When the (r+1)th card is to be drawn we are left with 3 aces and 49-r other cards, i.e. 52-r cards in all. The (r+1)th card has to be an ace, the probability of which is $\dfrac{3}{52-r}$. Hence by the rule of compound probability, the required chance

$$= \frac{{}^{4}C_1 \times {}^{48}C_{r-1}}{{}^{52}C_r} \times \frac{3}{52-r} = \frac{48!}{(r-1)!\,(49-r)!}\cdot\frac{r!\,(52-r)!}{52!}\cdot\frac{4\cdot 3}{52-r} = \frac{r(51-r)(50-r)}{13\cdot 17\cdot 50\cdot 49}.$$
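Both results of Ex. 1 can be checked numerically from the compound-probability expressions in the solution. The sketch below uses hypothetical function names and also confirms that each set of chances adds up to 1.

```python
from fractions import Fraction
from math import comb


def first_ace_after(n):
    """Chance that exactly n cards are dealt before the first ace."""
    return Fraction(comb(48, n), comb(52, n)) * Fraction(4, 52 - n)


def second_ace_after(r):
    """Chance that exactly r cards are dealt before the second ace."""
    one_ace_in_r = Fraction(comb(4, 1) * comb(48, r - 1), comb(52, r))
    return one_ace_in_r * Fraction(3, 52 - r)


print(first_ace_after(0), second_ace_after(1))
print(sum(first_ace_after(n) for n in range(49)))       # n = 0, ..., 48
print(sum(second_ace_after(r) for r in range(1, 50)))   # r = 1, ..., 49
```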

Ex. 2. A person writes n letters and addresses n envelopes; if the letters are placed in the envelopes at random, (i) what is the probability that every letter goes wrong? (ii) what is the probability that exactly k letters are placed in the correct envelopes?

[Bombay M.A. 61, Delhi 66, Punjab M.A. 62]

Sol. I. Let $u_n$ denote the number of ways in which all the letters go wrong, and let abcd... represent the arrangement in which all the letters are in their own envelopes. Now if a in any other arrangement occupies the place of an assigned letter b, this letter must either occupy a's place or some other.

(i) Suppose b occupies a's place. Then the number of ways in which all the remaining n-2 letters go wrong is $u_{n-2}$, and therefore the number of ways in which a may be displaced by interchange with some one of the other n-1 letters, and the rest go wrong, is $(n-1)u_{n-2}$.

(ii) Suppose a occupies b's place, and b does not occupy a's. Then in arrangements satisfying the required conditions, since a is fixed in b's place, the letters b, c, d, ... will all go wrong in $u_{n-1}$ ways; therefore the number of ways in which a occupies the place of another letter but not by interchange with that letter is $(n-1)u_{n-1}$;

$$\therefore\; u_n = (n-1)(u_{n-1} + u_{n-2})$$

or

$$u_n - nu_{n-1} = (-1)\{u_{n-1} - (n-1)u_{n-2}\}.$$

This expression provides a recurrence relation. Hence

$$u_n - nu_{n-1} = (-1)^2\{u_{n-2} - (n-2)u_{n-3}\} = \dots = (-1)^{n-2}(u_2 - 2u_1).$$

But $u_1 = 0$, $u_2 = 1$; thus

$$u_n - nu_{n-1} = (-1)^n, \quad\text{or}\quad \frac{u_n}{n!} - \frac{u_{n-1}}{(n-1)!} = \frac{(-1)^n}{n!}.$$

Putting n = 2, 3, ..., n and adding, we get

$$\frac{u_n}{n!} = \frac{1}{2!} - \frac{1}{3!} + \dots + \frac{(-1)^n}{n!}, \quad\text{i.e.}\quad u_n = n!\left[\frac{1}{2!} - \frac{1}{3!} + \dots + \frac{(-1)^n}{n!}\right].$$

Now the total number of ways in which n letters can be put in n envelopes is n!; therefore the required chance is

$$\frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \dots + \frac{(-1)^n}{n!}.$$

II. k letters can be chosen in ${}^nC_k$ ways and the remaining n-k letters will go wrong in $u_{n-k}$ ways; hence

$$\text{Required probability} = \frac{{}^nC_k\,u_{n-k}}{n!} = \frac{1}{k!}\left[\frac{1}{2!} - \frac{1}{3!} + \dots + \frac{(-1)^{n-k}}{(n-k)!}\right]$$

$$= \frac{1}{k!}\sum_{r=0}^{n-k}\frac{(-1)^r}{r!}, \quad\text{for } k = 0, 1, 2, 3, \dots, n.$$
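The recurrence $u_n = (n-1)(u_{n-1} + u_{n-2})$ lends itself to a short computation, which can be compared with a direct count over all n! arrangements. This is a sketch with hypothetical function names; the starting value $u_0 = 1$ is a convention assumed here so that the recurrence reproduces $u_2 = 1$.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial


def derangements(n):
    """u_n from the recurrence u_n = (n-1)(u_{n-1} + u_{n-2})."""
    u = [1, 0]  # u_0 = 1 (assumed convention), u_1 = 0
    for m in range(2, n + 1):
        u.append((m - 1) * (u[m - 1] + u[m - 2]))
    return u[n]


def chance_exactly_k_correct(n, k):
    """nCk * u_{n-k} / n!, as in part II of the solution."""
    return Fraction(comb(n, k) * derangements(n - k), factorial(n))


def brute_force(n, k):
    """The same chance by listing all n! ways of filling the envelopes."""
    hits = sum(
        1
        for p in permutations(range(n))
        if sum(i == x for i, x in enumerate(p)) == k
    )
    return Fraction(hits, factorial(n))


print([derangements(n) for n in range(1, 8)])
print(chance_exactly_k_correct(5, 0), brute_force(5, 0))
```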


Ex. 3. An urn contains a white and b black balls. Balls are drawn one by one until only those of the same colour are left. What is the probability that they are white?

Sol. Only the white balls in the urn can be left either after b, or b+1, or b+2, ..., or b+a-1 draws.

Suppose that the white balls only are left after b+r draws; hence in the first b+r-1 draws we should positively draw b-1 black balls, and the last ball drawn should be black. The probability for this obviously is

$$\frac{{}^{b+r-1}C_{b-1}}{{}^{a+b}C_b} \quad\text{or}\quad \frac{{}^{b+r-1}C_{r}}{{}^{a+b}C_b},$$

where r = 0, 1, 2, ..., a-1.

Hence the required probability is

$$\sum_{r=0}^{a-1} \frac{{}^{b+r-1}C_{b-1}}{{}^{a+b}C_b}.$$

Now

$$\sum_{r=0}^{a-1} {}^{b+r-1}C_{b-1} = {}^{b-1}C_{b-1} + {}^{b}C_{b-1} + \dots + {}^{b+a-2}C_{b-1}$$

= term independent of x in

$$\frac{(1+x)^{b-1}}{x^{b-1}} + \frac{(1+x)^{b}}{x^{b-1}} + \dots + \frac{(1+x)^{b+a-2}}{x^{b-1}}$$

= term independent of x in

$$\frac{(1+x)^{b-1}}{x^{b-1}}\left[1 + (1+x) + (1+x)^2 + \dots + (1+x)^{a-1}\right]$$

= term independent of x in

$$\frac{(1+x)^{b-1}}{x^{b-1}}\cdot\frac{(1+x)^a - 1}{x}$$

= term independent of x in

$$\frac{(1+x)^{a+b-1} - (1+x)^{b-1}}{x^{b}}$$

$$= {}^{a+b-1}C_{b} = {}^{a+b-1}C_{a-1}.$$

Thus required probability

$$= \frac{{}^{a+b-1}C_{a-1}}{{}^{a+b}C_b} = \frac{(a+b-1)!}{(a-1)!\,b!}\cdot\frac{b!\,a!}{(a+b)!} = \frac{a}{a+b}.$$
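The sum in Ex. 3 can be evaluated exactly for small a and b, and compared with a direct simulation of every order in which the balls can be drawn. The sketch below uses hypothetical function names and assumes a and b are both at least 1.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def chance_white_left(a, b):
    """The sum over r = 0..a-1 of C(b+r-1, b-1) / C(a+b, b)."""
    return sum(Fraction(comb(b + r - 1, b - 1), comb(a + b, b)) for r in range(a))


def brute_force(a, b):
    """Follow every equally likely drawing order until one colour is left."""
    left_white = total = 0
    for black_positions in combinations(range(a + b), b):
        order = ["B" if i in black_positions else "W" for i in range(a + b)]
        drawn = 0
        # keep drawing while the balls still in the urn show two colours
        while len(set(order[drawn:])) > 1:
            drawn += 1
        total += 1
        left_white += order[drawn] == "W"
    return Fraction(left_white, total)


print(chance_white_left(3, 5), brute_force(3, 5), Fraction(3, 3 + 5))
```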


Ex. 4. A bag contains a coin of value M, and a number of other coins whose aggregate value is m. A person draws one at a time till he draws the coin M; find the value of his expectation.

Sol. When the person draws the coin M, he ceases to draw. He is therefore to get M in all cases. His expectation is therefore M + x, where x is to be found out.

x = probable value of a draw = probability × value of a draw.

Let there be n coins each of value m/n.

Then the expectation of the person is $M + p\cdot\dfrac{m}{n}$, where p is the sum of the probabilities of his drawing a first, a second, ..., an n-th coin of value m/n.

Now

$$p = \frac{n}{n+1} + \frac{n}{n+1}\cdot\frac{n-1}{n} + \frac{n}{n+1}\cdot\frac{n-1}{n}\cdot\frac{n-2}{n-1} + \dots$$

$$= \frac{1}{n+1}\left[n + (n-1) + (n-2) + \dots + 1\right] = \frac{n(n+1)}{2(n+1)} = \frac{n}{2}.$$

$$\therefore\; \text{Expectation} = M + \frac{n}{2}\cdot\frac{m}{n} = M + \frac{m}{2}.$$
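The value M + m/2 can be checked by averaging the person's gain over every equally likely order of drawing. The sketch below uses a hypothetical function name; it also accepts unequal values for the other coins, which goes slightly beyond the equal-value assumption made in the solution.

```python
from fractions import Fraction
from itertools import permutations


def expectation(M, others):
    """Average total drawn, over all orders, up to and including coin M."""
    coins = [("M", M)] + [("c%d" % i, v) for i, v in enumerate(others)]
    total = Fraction(0)
    count = 0
    for order in permutations(coins):
        gained = 0
        for name, value in order:
            gained += value
            if name == "M":  # the person ceases to draw
                break
        total += gained
        count += 1
    return total / count


print(expectation(5, [2, 2, 2, 2]))  # equal coins, m = 8
print(expectation(10, [1, 2, 3]))    # unequal coins, m = 6
```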


Ex. 5. The probability that a family has exactly n children is $\alpha p^n$ ($n \ge 1$), and $p_0 = 1 - \alpha p(1 + p + p^2 + \dots)$ is the probability that a family has no children. All sex distributions of n children in a family have the same probability. Show that the probability that a family contains exactly k boys is

$$\frac{2\alpha p^k}{(2-p)^{k+1}}, \quad k \ge 1.$$

Given that a family includes at least one boy, show that the probability that there are two or more is $p/(2-p)$.

Sol. The probability that a family has exactly n children is $\alpha p^n$. Now if this family of n children contains exactly k boys, its probability is ${}^nC_k(\frac12)^k(\frac12)^{n-k}$, since the probability of a child being either a male or a female is the same, i.e. $\frac12$.

Hence by the theorem of compound probability, the probability of a family of n children containing exactly k boys is

$$\alpha p^n\cdot{}^nC_k(\tfrac12)^k(\tfrac12)^{n-k}, \quad\text{where } n = k, k+1, k+2, \dots$$

By the addition theorem of probability, the required probability is

$$\sum_{n=k}^{\infty} \alpha p^n\cdot{}^nC_k(\tfrac12)^k(\tfrac12)^{n-k} = \alpha\left(\frac{p}{2}\right)^k\sum_{n=k}^{\infty}{}^nC_k\left(\frac{p}{2}\right)^{n-k} = \alpha\left(\frac{p}{2}\right)^k\left(1 - \frac{p}{2}\right)^{-(k+1)}$$

$$= \frac{2\alpha p^k}{(2-p)^{k+1}} = \frac{2\alpha}{2-p}\left(\frac{p}{2-p}\right)^k, \quad k \ge 1.$$

Let A denote the event of a family including at least one boy; then

$$P(A) = \frac{2\alpha}{2-p}\sum_{k=1}^{\infty}\left(\frac{p}{2-p}\right)^k = \frac{2\alpha}{2-p}\cdot\frac{p}{2-p}\cdot\frac{1}{1 - \frac{p}{2-p}} = \frac{\alpha p}{(2-p)(1-p)}.$$

Let B denote the event of a family including two or more boys; then

$$P(B) = \frac{2\alpha}{2-p}\sum_{k=2}^{\infty}\left(\frac{p}{2-p}\right)^k = \frac{2\alpha}{2-p}\left(\frac{p}{2-p}\right)^2\cdot\frac{1}{1 - p/(2-p)} = \frac{\alpha p^2}{(2-p)^2(1-p)}.$$

We are required to find the conditional probability

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(B)}{P(A)},$$

since $B \subset A$ and $P(A \cap B) = P(B)$;

or

$$P(B \mid A) = \frac{\alpha p^2}{(2-p)^2(1-p)}\cdot\frac{(2-p)(1-p)}{\alpha p} = \frac{p}{2-p}.$$

Observe that $p_0 + p_1 + p_2 + \dots = 1$.

Ex. 6. Peter and Paul play a game with two dice. Peter plays first by throwing the dice together. If the total number of points is a prime number other than 2 he wins outright; if it is even he throws again under the same conditions; in other cases the throw passes to Paul, who throws under the same conditions. What is the probability of Peter's winning?

Sol. The total number of ways in which two dice can fall is 6 × 6 or 36. The possible throws are 2, 3, 4, ..., 12 and the numbers of ways in which they can occur are:

Total points : 2  3  4  5  6  7  8  9  10  11  12 | Total
No. of ways  : 1  2  3  4  5  6  5  4   3   2   1 |  36

I. The probability of throwing a prime number 3, or 5, or 7, or 11 is

$$\frac{2+4+6+2}{36} \quad\text{or}\quad \frac{14}{36}.$$

II. The probability of throwing an even number 2 or 4 or 6 or 8 or 10 or 12 is

$$\frac{1+3+5+5+3+1}{36} \quad\text{or}\quad \frac{18}{36}.$$

III. The probability of throwing neither is

$$\frac{36-(14+18)}{36} \quad\text{or}\quad \frac{4}{36}.$$

The events I, II and III are mutually exclusive. Let P be the probability of Peter's winning. Now if Peter throws a prime number other than 2 he wins outright, and the probability of his doing so is thus 14/36; if he throws an even number he throws again and his probability of winning in this case is $\frac{18}{36}P$; if he throws neither the throw passes to Paul, whose chance is then P, so that Peter's chance of winning is $\frac{4}{36}(1-P)$. Thus we have from total probability

$$P = \frac{14}{36} + \frac{18}{36}P + \frac{4}{36}(1-P),$$

giving P = 18/22 = 9/11.
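The three probabilities of Ex. 6 and the resulting equation for P can be reproduced by counting the 36 falls of the dice. This is a sketch only; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# number of ways of throwing each total with two dice
ways = {}
for d1, d2 in product(range(1, 7), repeat=2):
    ways[d1 + d2] = ways.get(d1 + d2, 0) + 1

win = Fraction(sum(ways[t] for t in (3, 5, 7, 11)), 36)      # event I
again = Fraction(sum(ways[t] for t in range(2, 13, 2)), 36)  # event II
passes = 1 - win - again                                     # event III

# P = win + again * P + passes * (1 - P), solved for P
P = (win + passes) / (1 - again + passes)
print(win, again, passes, P)
```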

Ex. 7. From a pack of 52 cards an even number of cards is drawn. Show that the probability that half of these cards will be red and half will be black is

$$\left[\frac{52!}{(26!)^2} - 1\right] \Big/ \left(2^{51} - 1\right).$$

Sol. For any integer n we have

$${}^nC_0 + {}^nC_2 + {}^nC_4 + \dots = {}^nC_1 + {}^nC_3 + {}^nC_5 + \dots = \tfrac12(1+1)^n = 2^{n-1}$$

and

$$({}^nC_0)^2 + ({}^nC_1)^2 + \dots + ({}^nC_n)^2 = {}^{2n}C_n = \frac{(2n)!}{(n!)^2},$$

for

$$({}^nC_0)^2 + ({}^nC_1)^2 + \dots + ({}^nC_n)^2$$

= term independent of x in $(1+x)^n(1+1/x)^n$
= coefficient of $x^n$ in the expansion of $(1+x)^{2n}$

$$= \frac{(2n)!}{(n!)^2}.$$

Total number of ways in which an even number of cards, like 2, 4, 6, ... etc., can be drawn is

$${}^{52}C_2 + {}^{52}C_4 + {}^{52}C_6 + \dots + {}^{52}C_{52} = 2^{52-1} - {}^{52}C_0 = 2^{51} - 1.$$

The number of ways in which half of the cards are red and half of the cards are black is

$$({}^{26}C_1)^2 + ({}^{26}C_2)^2 + \dots + ({}^{26}C_{26})^2 = \frac{52!}{(26!)^2} - 1.$$

Hence the required probability is

$$\left[\frac{52!}{(26!)^2} - 1\right] \Big/ \left(2^{51} - 1\right).$$
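The two binomial identities used in Ex. 7 can be confirmed with exact integer arithmetic. This is a sketch; the variable names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

# ways of drawing an even (non-zero) number of cards from 52
even_draws = sum(comb(52, k) for k in range(2, 53, 2))

# ways of drawing j red and j black cards, j = 1, ..., 26
balanced = sum(comb(26, j) ** 2 for j in range(1, 27))

chance = Fraction(balanced, even_draws)
print(even_draws == 2 ** 51 - 1)
print(balanced == factorial(52) // factorial(26) ** 2 - 1)
print(float(chance))
```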




Miscellaneous Set 


1. Let a, b, c, d be four non-negative integers such that a + b + c + d = 13. Find the probability that in a bridge game the players North, East, South, West have a, b, c, d spades respectively.

[Bombay 1964] 


Ans. 


[(?=)--] 


2. Two digits are selected at random from the digits 1 through 9. If the sum is even, find the probability that both numbers are odd.

[Ans. 5/8]


3. Four people, called North, South, East and West, are each dealt 13 cards from an ordinary deck of 52 cards.

(i) If South has no aces, find the probability that his partner North has exactly two aces.

[Ans. 650/2109]


(ii) If North and South together have nine hearts, find the 
probability that East and West each has two hearts. 



4. If a symmetrical coin is tossed five times, what is the probability of three or more consecutive heads? [Ans. 1/4]

5. A five-digit number is formed by writing the digits 1, 2, 3, 4, 5 in a random manner. What is the probability that it is divisible by four? [Ans. 1/5]

6. Cards are drawn at random, one at a time with replacement, from an ordinary pack of playing cards. What is the probability that all four suits have appeared in the first n draws?

If cards are drawn until all four suits have appeared, deduce the probability that exactly n cards will be drawn.

[Ans. $1 - 4(\frac34)^n + 6(\frac12)^n - 4(\frac14)^n$; $(\frac34)^{n-1} - 3(\frac12)^{n-1} + 3(\frac14)^{n-1}$]

7. Three machines A, B and C produce respectively 60%, 30% and 10% of the total number of items of a factory. The percentages of defective output of these machines are respectively 2%, 3% and 4%. An item is selected at random and is found defective. Find the probability that the item was produced by machine C. [Ans. 4/25]

8. An urn contains 4 white, 5 red and 6 black balls. 
Another contains 5 white, 6 red and 7 black balls. One ball is 
selected at random from each urn. What is the probability they 
will be of the same colour ? 

9. (a) According to a Gallup poll, 5 out of 7 are in favour
of a proposal. What is the probability that if three people are
chosen at random, there will be a majority against the proposal ?

(b) Find the probability that in a room with r people, no
two have the same birthday.

Hint. There are 365 possibilities for each person's birthday
(neglecting February 29). There are then $365^r$ possibilities for the
birthdays of r people. The first person can have any of 365 days
for his birthday. For each of these, if the second person is to
have a different birthday, there are only 364 possibilities for his
birthday. For the third man, there are 363 possibilities if he is to
have a different birthday from the first two, etc. Thus the probability
that no two people have the same birthday in a group of
r people is

$$\frac{365 \cdot 364 \cdots (365 - r + 1)}{365^r}.$$
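
The product in the hint is easy to evaluate numerically. The following is a minimal sketch under the hint's own assumptions (365 equally likely days, February 29 neglected); the function names are illustrative and not part of the text.

```python
from fractions import Fraction


def prob_all_distinct(r, days=365):
    """P(no two of r people share a birthday), each day equally likely."""
    p = Fraction(1)
    for k in range(r):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p *= Fraction(days - k, days)
    return p


def prob_shared(r, days=365):
    """P(at least two of r people share a birthday)."""
    return 1 - prob_all_distinct(r, days)


print(float(prob_shared(23)))  # slightly above one half
```

The complement computed by `prob_shared` is the quantity asked for in part (c).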

(c) What is the probability that at least two out of n people
have the same birth date ? [B. Sc. Delhi 71]

(d) What is the probability that

(i) the birthdays of twelve people will fall in twelve different
calendar months (assume equal probabilities for the 12 months) ?

(ii) the birthdays of six people will fall in exactly two
calendar months ? [B. Sc. Poona f5]

[Ans. (i) $12!/12^{12}$, (ii) $\binom{12}{2}(2^6 - 2)/12^6$]

10. An urn contains a white and N − a black balls, and n
balls are drawn in succession at random without replacement.
What is the probability that the first r balls drawn are white and the
last n − r are black ? What is the probability that exactly s out
of the n balls drawn are white ?

11. If four squares are chosen at random on a chess board,
find the chance that they should be in a diagonal line.

[Ans. 13/22692]

12. An urn contains 4 white and 5 black balls ; a second urn
contains 5 white and 4 black balls. One ball is transferred from
the first to the second urn, then a ball is drawn from the second urn.
What is the probability that it is white ?

[Ans. 49/90] 

13. Three urns contain respectively 1 white and 2 black balls,
2 white and 1 black balls, 2 white and 2 black balls. One ball is
transferred from the first urn into the second ; then one from the
latter is transferred into the third. Finally one ball is drawn from
the third urn. What is the probability of its being white ?

[Ans. 31/60]

14. In a group of equal number of men and women 10%
men and 45% women are unemployed. What is the probability
that a person selected at random is employed ?

[Ans. 29/40]

15. Cards are dealt one by one from an ordinary pack
(without replacement) until two aces have appeared. Find the
most probable number of cards to be turned up. [Ans. 18]

16. What is the probability of getting 9 cards of the same
suit in one hand at a game of bridge ?

[Ans. $4\binom{13}{9}\binom{39}{4} \Big/ \binom{52}{13}$]


17. (a) Find the probability that in a random arrangement of
the letters of the word 'UNIVERSITY' the two I's don't come
together.

(b) The chance of success in each trial is p. If $p_k$ is the
probability that there are an even number of successes in k trials, prove
that

$$p_k = p + p_{k-1}(1 - 2p).$$

Deduce that $p_k = \tfrac12\left[1 + (1 - 2p)^k\right]$. [B. Sc. Bom. 70]

[Hint. The event can happen in the following mutually
exclusive ways :

(i) in (k − 1) trials an even number of successes occur and then a
failure with probability (1 − p) at the k-th trial,

(ii) in (k − 1) trials, an odd number of successes and then a
success at the k-th trial.

Hence $p_k = (1 - p)\,p_{k-1} + (1 - p_{k-1})\,p = p + (1 - 2p)\,p_{k-1}$.]

(c) In a bridge game North and South have ten spades
between them. Find the probability that either East or West has
no spade. [B. Sc. Cal. 68]

[Ans. 11/50, 13/50]

(d) Let A t B and C be any three events. If P(A n B(~) C) =1, 
P(AnBnC)==}, p(J nBr)C ) = i and P(A(lBr\C) = i. Verify 

whether A, B and C are (i) pairwise independent (ii) completely 

independent. 


18. A bag contains 6 white and 9 black balls. Four balls
are drawn at a time. Find the probability for the first draw to
give 4 white and the second to give 4 black balls in each of the
following cases :

(a) The balls are replaced before the second draw.

(b) The balls are not replaced before the second draw.

[Ans. (a) 6/5915, (b) 3/715]

19. One shot is fired from each of three guns. $E_1$, $E_2$, $E_3$
denote the events that the target is hit by the first, second and
third guns respectively. If $P(E_1) = 0.5$, $P(E_2) = 0.6$ and $P(E_3) = 0.8$,
and $E_1$, $E_2$, $E_3$ are independent events, find the probability that
(a) exactly one hit is registered (b) at least two hits are registered.

[Ans. (a) 0.26 (b) 0.7]

20. In the long run 3 vessels out of every 100 are sunk. If
10 vessels are out, what is the probability that :

(i) exactly 6 will arrive safely, and
(ii) at least 6 will arrive safely ?


21. Eight mice are selected at random from a large number
and then divided into two groups of four each, group A and
group B. Each mouse in group A is given a dose 'a' of a certain
poison which is expected to kill one in four. Each mouse in group
B is given a dose of another poison which is expected to kill
one in two. Show that, nevertheless, there may be fewer deaths
in group B than in group A and find the probability of this
happening.

[Ans. 525/4096]
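
The answer to problem 21 can be reproduced by treating the deaths in each group as binomial counts and summing over all cases in which group B has fewer deaths. This is a sketch under that assumption (independent mice, death probabilities 1/4 and 1/2); the helper `binom_pmf` is a name introduced here for illustration.

```python
from fractions import Fraction
from math import comb


def binom_pmf(n, p):
    """Probabilities of 0, 1, ..., n successes in n independent trials."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


deaths_a = binom_pmf(4, Fraction(1, 4))  # group A: each mouse dies with probability 1/4
deaths_b = binom_pmf(4, Fraction(1, 2))  # group B: each mouse dies with probability 1/2

# Sum P(A has i deaths) * P(B has j deaths) over all pairs with j < i.
p_fewer_in_b = sum(deaths_a[i] * deaths_b[j] for i in range(5) for j in range(i))
print(p_fewer_in_b)  # 525/4096
```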





22. A lady declares that by taking a cup of tea with milk
she can discriminate whether the milk or the tea-infusion was first
added to the cup. It is proposed to test this assertion by means of
an experiment with 10 cups of tea, five made in one way and five
in the other, and presenting them to the lady for judgment in a
random order.

(i) Calculate the probability that on the null hypothesis (i.e.
the lady has no discrimination power) the lady would judge
correctly all the ten cups, it being known to her that 5 are of each
kind.

(ii) Suppose that the tea cups were presented to the lady in five
pairs, each pair to consist of cups of each kind in a random order.
How would the probability of correctly judging every cup on
the null hypothesis be altered in this case ?

[Ans. (i) 1/252, (ii) 1/32]

23. A bag contains 5 white, 7 green, 12 red and 14 black
balls. The balls are indistinguishable from each other except by
colour. A ball is drawn and replaced after its colour has been
noted, on ten occasions. If any ball is as likely to be drawn as any
other, show that the probability that of the ten balls 3 will be white,
3 green, 2 red and 2 black is

$$\frac{10!}{3!\,3!\,2!\,2!}\left(\frac{5}{38}\right)^3\left(\frac{7}{38}\right)^3\left(\frac{12}{38}\right)^2\left(\frac{14}{38}\right)^2.$$

24. (a) A group of 2N boys and 2N girls is divided into two
equal groups. Find the probability p that each group will be
equally divided into boys and girls. Estimate p, using Stirling's
formula. [Stirling's formula : $n! \sim (2\pi)^{1/2}\, n^{n+1/2}\, e^{-n}$ as $n \to \infty$.]

(b) A and B engage themselves in a game, each step of which
consists in one of them winning a counter from the other. At the
beginning, A has a counters and B has b counters, and in each
successive step the probability of A winning a counter from B is p. The
game is terminated when either of the two has all a + b counters.
What is the probability of A's winning when he has n counters ?
Consider the two cases (i) p ≠ q, (ii) p = q, where q = 1 − p.

[M.Sc. Mysore 65, Bombay 58, Gauhati 63]



25. What is the probability that (a) the birthdays of twelve
people will fall in twelve different calendar months (assume equal
probabilities for the twelve months), (b) the birthdays of six people
will fall in exactly two calendar months ?

[Ans. (a) $12!/12^{12}$, (b) $\binom{12}{2}(2^6 - 2)/12^6$]

26. If n balls are placed at random into n cells, find the
probability that exactly one cell will remain empty.

[Ans. $\binom{n}{2}\, n!/n^n$]

27. A man is given n keys of which only one fits his door.
He tries them successively (sampling without replacement). This
procedure may require 1, 2, ..., n trials. Show that each of these n
outcomes has probability 1/n.

[The probability of exactly r trials is

$$\frac{(n-1)(n-2)\cdots(n-r+1)}{n(n-1)\cdots(n-r+1)} = \frac{1}{n}.]$$

28. Find the probability that in a random arrangement of 52
bridge cards no two aces are adjacent.

[Ans. $\dfrac{49(49-1)(49-2)(49-3)}{52(52-1)(52-2)(52-3)}$]

29. The face cards (three from each suit) are extracted from
a full pack. Out of the remaining 40 cards, 4 are drawn at random.
What is the probability that (i) they belong to different suits ?

(ii) the 4 cards drawn belong to different suits and different
denominations ?

[Ans. (i) 1000/9139, (ii) 504/9139]


30. In a factory, machine A produces 30% of the output,
machine B produces 25% and machine C produces the remaining
45%. Of these outputs, 1, 1.6 and 2 percent respectively are defective. In a day's
run, the three machines produce 10,000 items. An item drawn at
random from a day's output is defective. What are the probabilities
that it was produced by A, B or C ?

[Ans. 0.19, 0.25, 0.56]


31. A sportsman's chance of shooting an animal at a distance
r (> a) is $a^2/r^2$. He fires when r = 2a, and if he misses he reloads
and fires when r = 3a, 4a, .... If he misses at distance na, the animal
escapes. What are the odds against the sportsman ?

[Ans. n + 1 : n − 1]





32. The probability that a family chosen at random has
exactly n children is $a p^n$ (0 < p < 1). Suppose also that the chance
of any child having blue eyes is a (0 <j< I), independently of the 
other. Prove that the probability of a family chosen at random 
having exactly k children with blue eyes is 

33. A bag contains n balls ; k drawings are made in succession,
and the ball on each occasion is found to be white. Find the
chance that the next drawing will give a white ball when the balls
are not replaced after each drawing.


j^Ans. 


, / (w-1)! . («-2)! 

p— 1 — kc\ -;— -f kc 2 


— ••• + 


( —l — k) 


ii 


Ml " 

34. An urn contains N balls, pN of which are white, the rest
black. From it are drawn $n_1$ balls, $r_1$ of which are found to be
white. The balls are then replaced, and $n_2$ balls are now extracted
among which $r_2$ are white balls. Show that the most probable value
of p for such a sample is $p = (r_1 + r_2)/(n_1 + n_2)$. Suppose that the
$n_1$ balls which are first extracted are not replaced in the urn. Show
that the most probable value of p for such a sample is a root of
the equation


n_ 

p 


rjj—r , 


+ 


Nr 


_ N(n 2 —r. 2 ) „ 


1 - p ■ p N — r x (N-r h )-(pN-r x ) 

35. R red and B black balls are arranged at random. What is
the probability of finding r red and b black balls

(a) in the first set of (r + b) balls,

(b) in the last set of (r + b) balls,

(c) in both the above sets,

(d) in at least one of the above sets. [l.b.l. 

[Ans. (a) $\dfrac{\binom{R}{r}\binom{B}{b}}{\binom{R+B}{r+b}} = \dfrac{(r+b)!}{r!\,b!}\cdot\dfrac{(R-r+B-b)!}{(R-r)!\,(B-b)!} \div \dfrac{(R+B)!}{R!\,B!}$

(b) the same as in (a)

(c) $\left[\dfrac{(r+b)!}{r!\,b!}\right]^2 \dfrac{(R-2r+B-2b)!}{(R-2r)!\,(B-2b)!} \div \dfrac{(R+B)!}{R!\,B!}$

(d) use the formula P(A + B) = P(A) + P(B) − P(AB).]
4.14. Use of Calculus and Geometry in probability

Let us recall the definition of probability here :

'If an event can happen in a ways and fail in b ways, and all
these ways are equally likely to occur, the probability of its
happening is a/(a + b) and of its failure to happen is b/(a + b).'

So long as a and b are finite, the theory of probability does
not call for any mode of treatment other than the processes of
ordinary arithmetic and algebra. If, however, we come across a
problem in which an event can happen in an infinite number of ways and fail
to happen in an infinite number of ways, all these being equally
likely, the calculation of a, b and a + b may call for the processes
of integral calculus or geometrical considerations.


The only uncountable sample spaces S which we will consider
here are those which have a finite geometrical measurement m(S) such
as length, area or volume, and in which a point is selected at random.
The probability of an event A, that is, that the selected point
belongs to A, is given by the ratio m(A)/m(S), i.e.

$$P(A) = \frac{\text{length of } A}{\text{length of } S} \quad\text{or}\quad P(A) = \frac{\text{area of } A}{\text{area of } S} \quad\text{or}\quad P(A) = \frac{\text{volume of } A}{\text{volume of } S}.$$

Such a probability space is said to be uniform.

Ex. 1. Two points are selected in a line AC of length a so as
to lie on opposite sides of its middle point M. Find the probability
that the distance between them is less than a/3. [Agra B. Sc.]

With A as origin, let one point be chosen to the left of M so that
its distance from A is x, and another point to the right of M such
that its distance from A is y, so that 0 ≤ x ≤ a/2 ≤ y ≤ a. The
probability of the point distant x from A lying in the small
interval dx is dx/(a/2), and of the point distant y from A lying
in the small interval dy is dy/(a/2). The distance y − x is less than
a/3 only when x lies between a/6 and a/2 and y lies between a/2
and x + a/3. Hence the required probability

$$= \int_{a/6}^{a/2}\!\int_{a/2}^{x+a/3} \frac{dx}{a/2}\,\frac{dy}{a/2} = \frac{1}{(a/2)^2}\int_{a/6}^{a/2}\left(x - \frac{a}{6}\right)dx = \frac{4}{a^2}\cdot\frac{a^2}{18} = \frac{2}{9}.$$


Ex. 2. A rod of length a is broken into three parts at random.
What is the probability that the three parts can form a triangle ?

Let us denote the lengths of the three parts by x, y, a − x − y.
Then we see that any pair of values satisfying 0 ≤ x ≤ a, 0 ≤ y ≤ a,
x + y ≤ a represents one of the possible equally likely
cases of dividing the line. Let us represent these pairs of values
diagrammatically ; we find that all possible equally likely pairs of
values of x and y are given by the points within the triangle OAB.

Favourable pairs of values of x and y are those for which the
sum of the lengths of any two parts is not less than the length of
the third part, i.e. satisfy the inequalities

x + y ≥ a − x − y, i.e. x + y ≥ a/2,
x + a − x − y ≥ y, i.e. y ≤ a/2,
y + a − x − y ≥ x, i.e. x ≤ a/2.

The points satisfying the above inequalities obviously lie in the
triangle DEC.

Hence required probability = (Area of △DEC)/(Area of △OAB) = 1/4.
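
The value 1/4 can be checked by simulation. The sketch below assumes that "broken at random" means two cut points chosen independently and uniformly along the rod, which gives pairs (x, y) uniformly spread over the triangle OAB used above; the function name, trial count and seed are arbitrary choices.

```python
import random


def triangle_fraction(trials=200_000, a=1.0, seed=0):
    """Estimate P(three pieces of a rod of length a can form a triangle)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Two uniform cut points, sorted so that u <= v.
        u, v = sorted((rng.uniform(0, a), rng.uniform(0, a)))
        parts = (u, v - u, a - v)
        # A triangle is possible exactly when no piece exceeds half the rod.
        if max(parts) < a / 2:
            hits += 1
    return hits / trials


print(triangle_fraction())  # close to 0.25
```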

Ex. 3. Three points are taken at random on the circumference
of a circle. Find the probability that the points lie on a semi-circle.

Suppose the length of the circumference is 2s. Let the three
points chosen on the circumference of the circle be a, b, c. Let x
denote the clockwise arc length from a to b, and let y denote the
clockwise arc length from a to c. Thus

0 < x < 2s and 0 < y < 2s.

Representing the pairs of values (x, y) diagrammatically
we find that their sample space S is the square OACB.

Favourable pairs of x and y are those that form the subset
A ⊂ S defined by any one of the following conditions :

(i) x, y < s (ii) x, y > s

(iii) x < s and y − x > s (iv) y < s and x − y > s.

The set A obviously consists of those points for which a, b, c
lie on a semi-circle. Thus

required probability = (area of □ODMF + area of □MECG + area of △DAE + area of △FBG) / (area of the square OACB)

$$= \frac{s^2 + s^2 + \tfrac12 s^2 + \tfrac12 s^2}{4s^2} = \frac{3}{4}.$$


Ex. 4. Two points are taken at random on a given straight
line of length a. Prove that the probability of their distance exceeding
a given length c (< a) is equal to $(1 - c/a)^2$.

Let the distances of the two points taken at random on the
line from one end be x and y. Then

0 < x < a and 0 < y < a.

We are required to find the probability of |y − x| > c.

The sample space S consists of the points lying in the square
determined by x = 0, y = 0, x = a and y = a.

The event will happen if the points fall in the regions of the
square determined by

(i) y − x > c and (ii) x − y > c.

These regions obviously are the triangles DEC and FAG.

Thus required probability

$$= \frac{\tfrac12 (a-c)^2 + \tfrac12 (a-c)^2}{a^2} = \left(1 - \frac{c}{a}\right)^2.$$

In fact this is only performing the integration in the expression

$$p = \left[\int_0^{a-c}\!\int_{x+c}^{a} dy\,dx + \int_0^{a-c}\!\int_{y+c}^{a} dx\,dy\right] \Big/ \int_0^a\!\int_0^a dx\,dy.$$


Ex. 5. Buffon's Needle Problem. A smooth table is ruled
with parallel lines at distances a apart. A needle of length l (< a)
is dropped on the table. What is the probability that it will cross
one of the lines ?

Let x be the distance of the centre of the needle from the nearest
line, θ the inclination of the needle to a perpendicular to the parallels ;
then as all values of x and θ between their extreme limits
are equally probable, the whole number of cases will be represented by

$$\int_0^{a/2}\!\int_{-\pi/2}^{\pi/2} dx\,d\theta = \frac{\pi a}{2}.$$

Now, if the needle crosses one of the lines,
we must have x < (l/2) cos θ, so that the
favourable cases will be measured by

$$\int_{-\pi/2}^{\pi/2}\!\int_0^{(l/2)\cos\theta} dx\,d\theta = \frac{l}{2}\int_{-\pi/2}^{\pi/2} \cos\theta\,d\theta = l.$$

Thus the probability required $= \dfrac{l}{\pi a/2} = \dfrac{2l}{\pi a}$.
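
A simulation gives a quick check on the needle problem. The sketch below draws the distance x of the centre from the nearest line uniformly on (0, a/2) and the inclination θ uniformly on (−π/2, π/2), exactly as in the solution above, and counts a crossing when x < (l/2) cos θ. The function name, default lengths and seed are illustrative choices.

```python
import math
import random


def buffon_estimate(l=1.0, a=2.0, trials=200_000, seed=0):
    """Estimate the crossing probability for a needle of length l < a."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(trials):
        x = rng.uniform(0, a / 2)                      # centre to nearest line
        theta = rng.uniform(-math.pi / 2, math.pi / 2)  # angle to the perpendicular
        if x < (l / 2) * math.cos(theta):
            crossings += 1
    return crossings / trials


estimate = buffon_estimate()
print(estimate, 2 * 1.0 / (math.pi * 2.0))  # estimate and 2l/(pi a)
```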

Exercises 

1. (Bertrand's problem) If a chord is drawn at random in
a given circle, what is the probability that its length is at least
equal to the radius ?

[Ans. This problem has no unique answer.]

2. A point P is taken at random in a line AB of length 2a,
all positions of the point being equally likely. Show that the expected
value of the area of the rectangle AP·PB is $\tfrac23 a^2$ and that
the probability of the area exceeding $\tfrac12 a^2$ is $1/\sqrt2$.

[Agra B. Sc. 58]

[Hint. Let AP = x ; then the chance that P lies between x
and x + δx is δx/2a. Expected area $= \int_0^{2a} x(2a - x)\,\dfrac{dx}{2a} = \tfrac23 a^2$.

Also $2ax - x^2 > \tfrac12 a^2$ if $(x - a)^2 < \tfrac12 a^2$, or $|x - a| < a/\sqrt2$.

∴ required probability $= \int_{a(1 - 1/\sqrt2)}^{a(1 + 1/\sqrt2)} \dfrac{dx}{2a} = \dfrac{1}{\sqrt2}$.]


3. From a point A on the circumference of a circle of radius
a, a chord AP is drawn in a random direction. Show that the
expected value of the length of the chord is 4a/π, and that the
variance of the length is $2a^2\left(1 - \dfrac{8}{\pi^2}\right)$. Also show that the chance
is 1/3 that the length AP will exceed the length of the side of an
equilateral triangle inscribed in the circle. [Punjab B. Sc. 59]

4. Two numbers, x, y, are chosen at random between 0 and
a ; find the chance that the product xy shall be less than $a^2/4$.

[Ans. $\tfrac14(1 + 2\log 2)$]

5. If a triangle is formed by joining three points taken at
random in the circumference of a circle, prove that the odds are
3 to 1 against its being acute angled.

6. Given that a, b are any positive quantities of which
neither is > 4 ; what is the probability that when real values are
assigned to them at random, the roots of the quadratic

$x^2 - ax + b = 0$

shall be real ? [Ans. 1/3]


7. Two points are selected at random on a line of length a.
What is the probability that none of the three sections into which
the line is thus divided is less than a/4 ?

[Hint. $2\int_{a/4}^{a/2}\!\int_{x+a/4}^{3a/4} dy\,dx \Big/ \int_0^a\!\int_0^a dx\,dy = 1/16$.]


8. Two independent events, A and B, must each happen once
and once only in the future. The chances of their happening at
time t from the present date are $(\log 1/a)\,a^t\,dt$ and $(\log 1/b)\,b^t\,dt$
respectively, where a and b are constants. Find the chance that
the two events happen in the order AB.

[Hint. The chance that B happens between now and time t
from now $= \int_0^t (\log 1/b)\,b^x\,dx$. Therefore the chance that B has
not happened by that time $= 1 - \int_0^t (\log 1/b)\,b^x\,dx$. The chance
that A happens at the moment of time dt $= (\log 1/a)\,a^t\,dt$, and
the chance that A happens at that moment, B not having happened,

$$= \left[1 - \int_0^t (\log 1/b)\,b^x\,dx\right](\log 1/a)\,a^t\,dt.$$

∴ total chance that the events happen in the order AB

$$= \int_0^\infty \left[1 - \int_0^t (\log 1/b)\,b^x\,dx\right](\log 1/a)\,a^t\,dt = \frac{\log a}{\log a + \log b}.]$$




[Probability Generating Function, Moment Generating Function
and Cumulative Function]


5.1. Mathematical Expectation. Definition. Let X be a discrete
one-dimensional random variable with possible values $x_1, x_2, \ldots,
x_i, \ldots$ and the respective probabilities $f(x_1), f(x_2), \ldots, f(x_i), \ldots$.
Suppose there exists a single-valued real function ψ(x) of x,
defined for all values $X = x_i$, i = 1, 2, ..., such that the sum
$\sum_i |\psi(x_i)|\, f(x_i)$ is finite. We shall then say that the mathematical
expectation of ψ(X), in symbols E[ψ(X)], exists and is equal to

(1) $E[\psi(X)] = \sum_i \psi(x_i)\, f(x_i)$.

In other words, the sum (1) will be called the mathematical
expectation of ψ(X) if it converges absolutely. If $\sum_i |\psi(x_i)|\, f(x_i)$
is divergent, we shall say that the mathematical expectation of ψ(X)
does not exist. Clearly the mathematical expectation of any ψ(X)
always exists if X is a discrete random variable with a finite
number of possible values.

In particular if ψ(X) = X, we have

$E(X) = \sum_i x_i\, f(x_i)$

provided $\sum_i |x_i|\, f(x_i)$ converges.

For the mathematical expectation of X we shall use the symbol
$\bar{x}$ or μ, so that

$E(X) = \bar{x}$ or μ.

Also $\mu_2 = E[X - E(X)]^2$

is called the variance of the distribution of X and is denoted
by Var(X) or simply V(X). The standard deviation is simply
$+\sqrt{\mu_2}$ and is denoted by σ.

Ex. 1. Find the expectation of the number of points obtained in
one throw of a die.

Here the random variable X takes the values 1, 2, 3, 4, 5, 6
with probability 1/6 for each.

Hence $E(X) = 1\cdot\tfrac16 + 2\cdot\tfrac16 + \cdots + 6\cdot\tfrac16 = \tfrac16\cdot\tfrac{6\cdot 7}{2} = \tfrac72$.
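
The definition of E(X) and of the variance for a finite discrete distribution translates directly into a few lines of code. The sketch below is illustrative only (the function names are not from the text) and uses exact fractions so that the die example reproduces 7/2.

```python
from fractions import Fraction


def expectation(values, probs):
    """E(X) = sum of x_i * f(x_i) for a finite discrete distribution."""
    assert sum(probs) == 1, "probabilities must add up to 1"
    return sum(x * p for x, p in zip(values, probs))


def variance(values, probs):
    """mu_2 = E[(X - E(X))^2]."""
    mu = expectation(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))


faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
print(expectation(faces, probs))  # 7/2
print(variance(faces, probs))     # 35/12
```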



Ex. 2. Find the expected number of throws of a pair of dice
up to and including the throw which produces the first double six.

The random variable X is discrete and has the infinitely
many possible values 1, 2, 3, ... with the corresponding
probabilities

$$\frac{1}{36},\ \frac{35}{36}\cdot\frac{1}{36},\ \left(\frac{35}{36}\right)^2\frac{1}{36},\ \ldots,\ \left(\frac{35}{36}\right)^{i-1}\frac{1}{36},\ \ldots$$

for the probability of throwing a double six is $\tfrac16\cdot\tfrac16 = \tfrac{1}{36}$ and
of not throwing a double six is $1 - \tfrac{1}{36} = \tfrac{35}{36}$.

Hence

$$E(X) = \sum_{i=1}^{\infty} i\left(\frac{35}{36}\right)^{i-1}\frac{1}{36}.$$

This series is absolutely convergent if it converges at all, as
all possible values of X are positive.

If we denote 35/36 by x for convenience's sake, we have

$$E(X) = \frac{1}{36}\sum_{i=1}^{\infty} i\,x^{i-1} = \frac{1}{36}\,\frac{d}{dx}\left[\sum_{i=0}^{\infty} x^i\right] = \frac{1}{36}\,\frac{d}{dx}\left[\frac{1}{1-x}\right] = \frac{1}{36}\cdot\frac{1}{(1-x)^2} = \frac{1}{36}\cdot\frac{1}{(1 - 35/36)^2} = 36.$$
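
The closed form can be compared with a partial sum of the series. The sketch below is a numerical illustration only: the function name and the cut-off of 5000 terms are arbitrary choices, and floating-point arithmetic is used, so the result agrees with 36 only up to rounding.

```python
def expected_throws(p, terms=5000):
    """Partial sum of sum_{i>=1} i * q**(i-1) * p with q = 1 - p."""
    q = 1.0 - p
    total = 0.0
    weight = p                 # equals q**(i-1) * p for the current i
    for i in range(1, terms + 1):
        total += i * weight
        weight *= q
    return total


print(expected_throws(1 / 36))  # approximately 36
```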

Ex. 3. Show that the expectation of the number of failures
preceding the first success in an indefinite series of independent trials
with constant probability p of success is (1 − p)/p.

[Agra 64, M.A. Bombay 53]

Here the random variable X denotes the number of failures
preceding the first success and takes infinitely many values 0, 1, 2,
... with corresponding probabilities $p, qp, q^2p, \ldots, q^i p, \ldots$, where
q = 1 − p.

Hence

$$E(X) = \sum_{i=0}^{\infty} i\,q^i p = qp\sum_{i=1}^{\infty} i\,q^{i-1} = qp\,\frac{d}{dq}\left[\sum_{i=0}^{\infty} q^i\right] = qp\,\frac{d}{dq}\left[\frac{1}{1-q}\right] = \frac{qp}{(1-q)^2} = \frac{q}{p}.$$



Ex. 4. A card is drawn at random from a pack of 52 cards.
If aces count one, king, queen and jack count ten each, and others
count at their face value, show that the expectation of the value of
the card is 85/13.

Here the random variable X takes the values 1, 2, 3, ..., 9 with
probability 4/52 for each, as in a pack of 52 cards there are four
cards of identical value. Also X takes the value 10 with probability
16/52, as there are 16 cards of value 10.

Hence

$$E(X) = (1 + 2 + 3 + \cdots + 9)\,\frac{4}{52} + 10\cdot\frac{16}{52} = \frac{9\cdot 10}{2}\cdot\frac{1}{13} + \frac{40}{13} = \frac{85}{13}.$$



Ex. 5. Give an example of the distribution of X for which
E(X) does not exist.

Let X assume the value $(-1)^{i+1}\,2^i/i$ with corresponding probability
$(\tfrac12)^i$, where i = 1, 2, 3, .... Then

$$\sum_i |x_i|\, f(x_i) = \sum_{i=1}^{\infty} \frac{2^i}{i}\cdot\frac{1}{2^i} = \sum_{i=1}^{\infty} \frac{1}{i}$$

is a divergent series, and the expectation of X does not exist,
although

$$\sum_i x_i\, f(x_i) = \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i}$$

converges.
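
The contrast between the two series can be seen numerically. In the sketch below (an illustration, with an arbitrary function name) the sum of the absolute terms is the harmonic series, whose partial sums keep growing, while the signed series settles near log 2.

```python
import math


def partial_sums(n):
    """Return (sum of |x_i| f(x_i), sum of x_i f(x_i)) over i = 1..n."""
    abs_sum = sum(1.0 / i for i in range(1, n + 1))                 # harmonic series
    signed_sum = sum((-1) ** (i + 1) / i for i in range(1, n + 1))  # alternating series
    return abs_sum, signed_sum


for n in (10, 1000, 100_000):
    print(n, partial_sums(n))
```

The first component grows roughly like log n without bound, which is why E(X) is said not to exist even though the signed sum converges.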

Ex. 6. A person draws cards one by one until he draws all the
aces. What is the expectation of the number of cards to be drawn ?

Suppose all the aces are drawn in x draws. This means that
in the (x − 1) previous drawings 3 aces must have been drawn and in
the x-th draw the remaining ace must be drawn. The probability
of such an event is

$$\frac{\binom{4}{3}\binom{48}{x-4}}{\binom{52}{x-1}}\cdot\frac{1}{52-(x-1)} = \frac{4\cdot 48!\,(x-1)!\,(52-x+1)!}{52!\,(x-4)!\,(48-x+4)!}\cdot\frac{1}{52-x+1} = \frac{4(x-1)(x-2)(x-3)}{52\cdot 51\cdot 50\cdot 49}.$$

Hence the required expectation

$$= \sum_{x=4}^{52} x\cdot\frac{4(x-1)(x-2)(x-3)}{52\cdot 51\cdot 50\cdot 49} = \frac{4}{52\cdot 51\cdot 50\cdot 49}\left[\sum_{x=4}^{52} x^4 - 6\sum_{x=4}^{52} x^3 + 11\sum_{x=4}^{52} x^2 - 6\sum_{x=4}^{52} x\right] = \frac{212}{5}.$$
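
The sum above is tedious by hand but immediate by machine. The sketch below uses the equivalent form C(x − 1, 3)/C(52, 4) for the probability that the last ace appears at draw x, which equals 4(x − 1)(x − 2)(x − 3)/(52·51·50·49); the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def prob_last_ace_at(x):
    """P(the fourth ace turns up exactly at draw x), for 4 <= x <= 52."""
    return Fraction(comb(x - 1, 3), comb(52, 4))


total_probability = sum(prob_last_ace_at(x) for x in range(4, 53))
expected_draws = sum(x * prob_last_ace_at(x) for x in range(4, 53))
print(total_probability, expected_draws)  # 1 212/5
```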



Exercises 


Ex. 1. Two persons A and B play a game as follows :

A tosses two unbiased coins. If he gets 2 heads, B pays him
2 rupees. If he gets 1 head, B pays him 1 rupee. If he gets no
heads, B pays him nothing. Is this a 'fair' game ? Find the
'entrance fee' to the game if this is to be a fair game.

[Outcome space S = {(HH), (HT), (TH), (TT)}

Random variable (net gain) X = 2, 1, 1, 0

Corresponding probabilities = (1/2)², (1/2)², (1/2)², (1/2)²

∴ E(X) = 1, and the game is not fair.

For the game to be fair, the 'entrance fee' should be 1 rupee.]

Ex. 2. The entry fee for each throw of a pair of dice is 15 nP.
A player wins in each trial the number of nP's equal to the product
of the scores on the dice. Is it profitable to participate ?

[No, the expected value is only 12¼ nP.]

Ex. 3. A and B alternately throw a die, the game terminating
in a win for A if he throws a 1 or a 6, or in a win for B if he
throws a 2, 3, 4 or 5. Find the probability that A wins and, if he
wins, find the average number of throws he takes, given that A
commences the play. [Ans. 3/7]

Ex. 4. (a) A man tosses a coin until a head appears and is
paid a number of rupees equal to the number of tosses he makes.
What is his expectation ?



(b) A bag contains 2n counters, of which half are marked
with odd numbers and half with even numbers, the sum of all the
numbers being S. A man is to draw two counters. If the sum of
the numbers drawn is odd, he is to receive that number of rupees ;
if even, he is to pay that number of rupees. Show that the expectation
is $\dfrac{S}{n(2n-1)}$ rupees.

[Sardar Patel Univ. 68]

[Hint. The sum is odd if one counter drawn shows an odd
number and the other an even number. The gain in this case is
$\dfrac{n\cdot n}{\binom{2n}{2}} \times \dfrac{2S}{2n}$. The sum is even if the two counters drawn
show either both odd numbers or both even numbers.
The loss in this case is $\dfrac{2\binom{n}{2}}{\binom{2n}{2}} \times \dfrac{2S}{2n}$.

Hence required expectation

$$= \frac{n^2}{n(2n-1)}\cdot\frac{S}{n} - \frac{n(n-1)}{n(2n-1)}\cdot\frac{S}{n} = \frac{nS}{n(2n-1)} - \frac{(n-1)S}{n(2n-1)} = \frac{S}{n(2n-1)}.$$

Notice that the probable value of one counter is $\dfrac{S}{2n}$.]

Ex. 5. Balls are taken one by one out of an urn containing
a white and b black balls until the first white ball is drawn. What
is the expectation of the number of black balls preceding the first
white ball ?

[Random variable X = 0, 1, 2, ...

and corresponding probabilities $\dfrac{a}{a+b}$, $\dfrac{b}{a+b}\cdot\dfrac{a}{a+b-1}$, $\dfrac{b}{a+b}\cdot\dfrac{b-1}{a+b-1}\cdot\dfrac{a}{a+b-2}$, ...

Ans. $\dfrac{a}{a+b}\left[\dfrac{b}{a+b-1} + \dfrac{2b(b-1)}{(a+b-1)(a+b-2)} + \dfrac{3b(b-1)(b-2)}{(a+b-1)(a+b-2)(a+b-3)} + \cdots\right]$]

Ex. 6. (a) A and B play a game in which A's chance of winning
is p, while B's is q, where p + q = 1. They have a contest,
the winner being the first to score two consecutive successes. Prove
that the expected number of games is

$(2 + pq)(1 - pq)^{-1}$.


(b) A lot is known to contain 2 defective and 8 non-defective
items. If these items are inspected at random, one after another,
what is the expected number of items that must be chosen in order
to remove all the defective ones ?

Ex. 7. If X is a random variable with mean μ and standard
deviation σ, prove that the standard deviation of the random variable
$(X - \mu)^2$ is $\{E(X - \mu)^4 - \sigma^4\}^{1/2}$.

Ex. 8. Two dice are thrown together and the scores added.
What is the chance that the total score exceeds 8 ?

Find the mean and standard deviation of the total score.
What is the standard deviation of the score for a single die ?

[Note : If for a participant in a game the expectation E(X)
is positive, the game is called "favourable" and if E(X) is negative
it is called "unfavourable" for this participant.]

Ex. 9. A player tosses a fair die. If a prime number occurs
he wins that number of rupees, but if a non-prime number occurs
he loses that number of rupees. Is the game favourable ?

[Ans. No.]

Ex. 10. An urn contains 3 white and 2 black balls. A and
B agree to play the following game. Each person draws two balls
at a single drawing, the balls being replaced after each drawing.
B will pay A the amount of Rs 5 for each white ball and Rs 2
for each black ball.

(a) What is the mathematical expectation of the player A ?

(b) How much should A pay B for the drawing of a white
ball and a black ball so that their expectations are the same ?

[Ans. Rs 7.60 ; any values of x and y satisfying the
equation 3x + 2y − 19 = 0]

Ex. 11. A and B play a game of tossing a die in succession,
A beginning first and the game terminating when one of them gets
an ace first. He who gets an ace first wins the game and also wins
an amount in rupees equal to the total number of tosses made by
them together. Find the mathematical expectation of the winnings
of each of them.

Ex. 12. Two players A and B alternately roll a pair of unbiased
dice. A wins if on a throw he obtains exactly six points
before B gets seven points, B winning in the opposite event. If A
begins the game, prove that his probability of winning is 30/61,
and that the expected number of trials needed for A's win is
approximately 6.



5.2. Definition.

Let X be a continuous one-dimensional random variable
with the probability density f(x). The expected value of X is defined as

$$E(X) = \int_{-\infty}^{\infty} x\, f(x)\,dx.$$

It may happen that this improper integral does not converge.
Hence we say that E(X) exists if and only if

$$\int_{-\infty}^{\infty} |x|\, f(x)\,dx$$

is finite.

Ex. 1. To find E(X), if X be uniformly distributed over the
interval [a, b].

The probability density of X is

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{for other values of } x. \end{cases}$$

Hence

$$\int_{-\infty}^{\infty} |x|\, f(x)\,dx = \frac{1}{b-a}\int_a^b |x|\,dx = \text{a finite number}$$

and the expectation of X exists and is equal to

$$E(X) = \int_a^b \frac{x}{b-a}\,dx = \frac{1}{b-a}\left[\frac{x^2}{2}\right]_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2},$$

which is the mid point of the interval [a, b].

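Ex. 2. Find the expectation of X when X follows the normal
probability distribution with the density

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2/2\sigma^2}, \quad -\infty < x < \infty.$$

The integral $\int_{-\infty}^{\infty} |x|\, f(x)\,dx$
exists, as may be easily verified. Hence the mathematical expectation
of X exists and is equal to

$$E(X) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} x\, e^{-(x-m)^2/2\sigma^2}\,dx \qquad \left[\text{put } y = \frac{x-m}{\sigma}\right]$$

$$= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} (m + \sigma y)\, e^{-y^2/2}\,dy.$$

Now

$$\int_{-\infty}^{\infty} e^{-y^2/2}\,dy = \sqrt{2\pi}, \qquad \int_{-\infty}^{\infty} y\, e^{-y^2/2}\,dy = 0.$$

Hence E(X) = m.
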


Ex. 3. The Cauchy probability law is specified by the probability
density function

$$f(x) = \frac{1}{\pi}\cdot\frac{b}{b^2 + x^2}, \quad -\infty < x < +\infty.$$

The mean E(X) does not exist, since the integral

$$\int_{-\infty}^{\infty} |x|\, f(x)\,dx = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{b\,|x|}{b^2 + x^2}\,dx = \infty.$$


5.3. For a two-dimensional discrete random variable (X, Y)
with the possible values $(x_i, y_j)$ and the probabilities $f(x_i, y_j)$,
i = 1, 2, ..., j = 1, 2, ..., the expectation of a function ψ(X, Y) is
defined by

$$E[\psi(X, Y)] = \sum_i \sum_j \psi(x_i, y_j)\, f(x_i, y_j)$$

if the series on the right-hand side is absolutely convergent. Similarly
in the continuous case

$$E[\psi(X, Y)] = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \psi(x, y)\, f(x, y)\,dx\,dy$$

provided that

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} |\psi(x, y)|\, f(x, y)\,dx\,dy$$

has a finite value.

The extension of these definitions to any number of dimensions
is obvious.

5.4. Addition Theorem for Mathematical Expectations.

Theorem 1. Let X and Y be any two random variables. Then

$$E(X + Y) = E(X) + E(Y).$$

Proof. Let the random variate X assume the m values $x_1, x_2, \ldots, x_m$ with probabilities $p_1, p_2, \ldots, p_m$, and Y assume the n values $y_1, y_2, \ldots, y_n$ with probabilities $p'_1, p'_2, \ldots, p'_n$. The sum X + Y will assume mn values of the type $x_i + y_j$ (i = 1, 2, ..., m; j = 1, 2, ..., n), since any of the m values of i may be associated with any of the n values of j.

Let $p_{ij}$ denote the probability that X assumes the value $x_i$ and, at the same time, Y assumes the value $y_j$. Then

$$E(X + Y) = \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij}\,(x_i + y_j)$$

$$= \sum_i \sum_j p_{ij}\, x_i + \sum_i \sum_j p_{ij}\, y_j$$

$$= \sum_i x_i \Big(\sum_j p_{ij}\Big) + \sum_j y_j \Big(\sum_i p_{ij}\Big).$$




Now $\sum_j p_{ij} = p_{i1} + p_{i2} + \ldots + p_{in}$ is the sum of the probabilities that X assumes the value $x_i$ while Y assumes one of the values $y_1, y_2, \ldots, y_n$, and so is equal to $p_i$; that is, $\sum_j p_{ij} = p_i$, and similarly $\sum_i p_{ij} = p'_j$. Hence

$$E(X + Y) = \sum_i x_i\, p_i + \sum_j y_j\, p'_j = E(X) + E(Y).$$

This relation is also valid when X and Y are random variables of a continuous type.

More generally, for the expectation of the sum of a finite number of discrete random variables (not necessarily independent), one obtains

$$E(X_1 + X_2 + \ldots + X_n) = E(X_1) + E(X_2) + \ldots + E(X_n),$$

provided that the expectation of each individual variable has a finite value.

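Ex. 1. Two dice are thrown; find the expected value for the sum of their face numbers.

Here the joint probability is $p_{ij} = \frac{1}{6} \cdot \frac{1}{6} = \frac{1}{36}$, and the marginal probabilities $p_i$ and $p'_j$ are

$$p_i = \tfrac{1}{6}, \qquad p'_j = \tfrac{1}{6}.$$

$$E(X) = E(Y) = \tfrac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \tfrac{7}{2}$$

$$E(X + Y) = \tfrac{1}{36}\,(2 + 3 + 4 + 5 + 6 + 7$$
$$\quad + 3 + 4 + 5 + 6 + 7 + 8$$
$$\quad + 4 + 5 + 6 + 7 + 8 + 9$$
$$\quad + 5 + 6 + 7 + 8 + 9 + 10$$
$$\quad + 6 + 7 + 8 + 9 + 10 + 11$$
$$\quad + 7 + 8 + 9 + 10 + 11 + 12)$$

$$= \tfrac{1}{36}(27 + 33 + 39 + 45 + 51 + 57) = \tfrac{1}{36} \cdot 252 = 7.$$

Directly, $E(X + Y) = E(X) + E(Y) = \tfrac{7}{2} + \tfrac{7}{2} = 7.$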
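The two routes to E(X + Y) in this example can be checked by brute-force enumeration. The following is only a minimal sketch, assuming two fair six-faced dice; the variable names are illustrative and not part of the text.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 36)  # joint probability of each ordered pair of faces

# E(X + Y) computed directly from the 36 equally likely pairs
direct = sum(p * (x + y) for x, y in product(faces, faces))

# E(X) + E(Y) computed from the two marginal distributions
marginal = 2 * sum(Fraction(1, 6) * x for x in faces)

print(direct, marginal)
```

Both quantities are exact fractions, so they can be compared for equality without rounding concerns.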

Ex. 2. An urn contains a white and b black balls, and c balls are drawn. What is the mathematical expectation of the number of the white balls drawn?

We attach the value 1 if a white ball is extracted and the value 0 if a black ball is extracted. The number of white balls drawn will then be

$$s = x_1 + x_2 + \ldots + x_c.$$

But the probability that the i-th ball removed will be white, when nothing is known of the other balls, is $\frac{a}{a+b}$; therefore

$$E(x_i) = \frac{a}{a+b} \cdot 1 + \frac{b}{a+b} \cdot 0 = \frac{a}{a+b}$$

for every i. Hence

$$E(s) = E(x_1 + x_2 + \ldots + x_c) = \sum_{i=1}^{c} E(x_i) = \frac{ca}{a+b}.$$
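As a cross-check, the same expectation can be computed from the full distribution of the number of white balls (drawing without replacement). This is a sketch under that assumption; the function name `expected_white` and the sample values a = 4, b = 6, c = 3 are hypothetical.

```python
from fractions import Fraction
from math import comb

def expected_white(a, b, c):
    """Exact E(s) for c draws without replacement from a white and b black balls."""
    total = comb(a + b, c)
    # P(s = k) = C(a, k) * C(b, c - k) / C(a + b, c)
    return sum(Fraction(k * comb(a, k) * comb(b, c - k), total)
               for k in range(min(a, c) + 1))

a, b, c = 4, 6, 3
print(expected_white(a, b, c), Fraction(c * a, a + b))
```

The indicator argument in the text reaches the same value without ever writing down the distribution of s, which is the point of the addition theorem.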


5.5. We derive now an extremely important expression for the variance of a probability law:

$$V(X) = E[(X - E[X])^2] = E[X^2] - E^2[X].$$

To prove it, we write, letting $\mu = E[X]$,

$$\sigma^2 = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2]$$

$$= E[X^2] - 2\mu\, E[X] + \mu^2$$

$$= E[X^2] - 2\mu^2 + \mu^2 = E[X^2] - \mu^2.$$

Ex. 1. The Bernoulli probability law with parameter p, in which 0 < p < 1, is defined by a random variable X taking the value 1 with probability p and taking the value 0 with probability q = 1 - p. Find the mean and variance of the random variable X.

Clearly

$$E(X) = 1 \cdot p + 0 \cdot q = p$$

$$E(X^2) = 1^2 \cdot p + 0^2 \cdot q = p$$

$$V(X) = E(X^2) - E^2(X) = p - p^2 = pq, \quad \text{since } 1 - p = q.$$
Ex. 2. The binomial probability law with parameters n and p, where n = 1, 2, ... and 0 < p < 1, is specified by the probability mass function

$$p(x) = \binom{n}{x} p^x q^{n-x} \quad \text{for } x = 0, 1, 2, \ldots, n$$
$$\phantom{p(x)} = 0 \quad \text{otherwise,}$$

where q = 1 - p. Find the mean and variance of the random variable X.

$$E[X] = \sum_{r=0}^{n} r\, p(r) = \sum_{r=0}^{n} r \binom{n}{r} p^r q^{n-r}$$

$$= np \sum_{r=1}^{n} \binom{n-1}{r-1} p^{r-1} q^{(n-1)-(r-1)}$$

$$= np\,(p + q)^{n-1} = np.$$

The mean square of X is given by

$$E[X^2] = \sum_{r=0}^{n} r^2 \binom{n}{r} p^r q^{n-r}$$

$$= \sum_{r=0}^{n} [\,r(r-1) + r\,] \binom{n}{r} p^r q^{n-r}$$

$$= \sum_{r=0}^{n} r(r-1) \binom{n}{r} p^r q^{n-r} + E[X]$$

$$= n(n-1)\,p^2 \sum_{r=2}^{n} \binom{n-2}{r-2} p^{r-2} q^{(n-2)-(r-2)} + np$$

$$= n(n-1)\,p^2\,(p + q)^{n-2} + np$$

$$= n(n-1)\,p^2 + np = n^2 p^2 + npq.$$

Therefore

$$\sigma^2 = V(X) = E[X^2] - E^2[X]$$

$$= n^2 p^2 + npq - (np)^2 = npq.$$
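The closed forms np and npq can be confirmed numerically by summing over the probability mass function. The sketch below assumes illustrative values n = 10 and p = 2/5; the helper name `binomial_moments` is hypothetical.

```python
from fractions import Fraction
from math import comb

def binomial_moments(n, p):
    """Return (mean, variance) by direct summation over the binomial pmf."""
    q = 1 - p
    pmf = [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(pmf))
    mean_square = sum(r * r * pr for r, pr in enumerate(pmf))
    return mean, mean_square - mean**2

n, p = 10, Fraction(2, 5)
mean, var = binomial_moments(n, p)
print(mean, var)            # direct summation
print(n * p, n * p * (1 - p))  # np and npq
```

Using `Fraction` keeps the arithmetic exact, so the summed moments agree with np and npq identically rather than approximately.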


Exercises 


1. A ball is drawn with replacement 10 times from an urn containing 2 white and 3 black balls. Find the probability distribution of the number of white balls drawn. Find its mean and variance.

2. From a group of five persons consisting of three men and two women, two are selected at random. If X represents the number of men among the selected persons, find the expectation and variance of X.

[Hint. X: 0, 1, 2. Corresponding Pr.: $\binom{2}{2}\big/\binom{5}{2}$, $\binom{2}{1}\binom{3}{1}\big/\binom{5}{2}$, $\binom{3}{2}\big/\binom{5}{2}$.]

3. A coin is tossed until it turns up head. Show that the mean number of tosses required is 2. Find the variance of the number of tosses required. [Ans. 2]

4. A uniform die has n + 1 faces numbered

$$0, \ \frac{1}{n}, \ \frac{2}{n}, \ \ldots, \ \frac{n-1}{n}, \ 1$$

respectively. Assuming it is equally likely to fall with any face uppermost, find the expectation and the variance corresponding to the number on the uppermost face. [Ans. 1/2, (n + 2)/12n]

5.6. Multiplication Theorem for Mathematical Expectations.

Theorem 2. The expected value of the product of two independent variates is equal to the product of their expected values, or in symbols

$$E(XY) = E(X)\, E(Y).$$

The variates X and Y are independent if the probability that either of them will assume a prescribed value does not depend on the value assumed by the other.

Proof. Let X and Y be two random variates, the first of which assumes the m values $x_i$ with probabilities $p_i$ (i = 1, 2, ..., m), and the second assumes the n values $y_j$ with probabilities $p'_j$ (j = 1, 2, ..., n). The product XY will assume the mn mutually exclusive values $x_i y_j$ with probability $p_i\, p'_j$. Hence

$$E(XY) = \sum_{i=1}^{m} \sum_{j=1}^{n} p_i\, p'_j\, x_i\, y_j$$

$$= \sum_i p_i x_i \Big(\sum_j p'_j\, y_j\Big)$$

$$= \sum_i p_i x_i\, E(Y)$$

$$= E(Y) \sum_i p_i x_i$$

$$= E(X)\, E(Y).$$

Obviously if E(X) = 0, then E(XY) is also zero.

By induction one can obtain the result that the expectation of the product of a finite number of discrete independent random variables is the product of their expectations:

$$E(X_1 X_2 \ldots X_n) = E(X_1)\, E(X_2) \ldots E(X_n).$$

Note that the independence of the variables is required for this equation but not for the equation

$$E(X_1 + X_2 + \ldots + X_n) = E(X_1) + E(X_2) + \ldots + E(X_n).$$

Ex. Find the expected value of the product of points on n dice. [Ans. $(7/2)^n$]
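The answer to this exercise can be verified for small n by enumerating all outcomes. This is a sketch only, assuming fair dice thrown independently; the function name `expected_product` is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import prod

def expected_product(n):
    """E(X1 * X2 * ... * Xn) for n independent fair dice, by full enumeration."""
    faces = range(1, 7)
    p = Fraction(1, 6**n)  # every ordered outcome is equally likely
    return sum(p * prod(outcome) for outcome in product(faces, repeat=n))

for n in (1, 2, 3):
    print(n, expected_product(n), Fraction(7, 2) ** n)
```

The enumeration has 6^n terms, so it is practical only for small n, whereas the multiplication theorem gives the value at once.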

5.7. Covariance and Correlation Coefficient.

Let us consider bivariate distributions. Let $(X_1, X_2)$ be a random variable having as its sample space the real plane. The mean and variance for $X_1$ are:

$$\mu_1 = E(X_1), \qquad \sigma_1^2 = E\{(X_1 - \mu_1)^2\}$$

and for $X_2$ are:

$$\mu_2 = E(X_2), \qquad \sigma_2^2 = E\{(X_2 - \mu_2)^2\}.$$

There is yet another simple measure which tells us how the values of $X_1$ are related to the values of $X_2$. For this, we consider the covariance of $X_1$ and $X_2$:

$$\text{Cov}\{X_1, X_2\} = \sigma_{12} = E\{(X_1 - \mu_1)(X_2 - \mu_2)\}.$$






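If $X_1$ and $X_2$ are independent, then

$$\sigma_{12} = E\{(X_1 - \mu_1)\}\, E\{(X_2 - \mu_2)\} = 0,$$

since $E\{(X_1 - \mu_1)\} = E\{X_1\} - \mu_1 = \mu_1 - \mu_1 = 0$.

Thus the covariance of two independent variates is equal to zero, but the converse is not true. This we show by an example.

Let U and V be two independent variates for which $\sigma_U^2 = \sigma_V^2$.

Let us take

$$X = U + V, \qquad Y = U - V$$

so that $\bar{X} = \bar{U} + \bar{V}$, $\bar{Y} = \bar{U} - \bar{V}$, where $\bar{X} = E(X)$ and so on.

Now

$$\text{Cov}\{X, Y\} = E\{(X - \bar{X})(Y - \bar{Y})\}$$

$$= E\{(U - \bar{U} + V - \bar{V})(U - \bar{U} - V + \bar{V})\}$$

$$= E\{(U - \bar{U})^2 - (V - \bar{V})^2\}$$

$$= E\{(U - \bar{U})^2\} - E\{(V - \bar{V})^2\}$$

$$= \sigma_U^2 - \sigma_V^2 = 0.$$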

But X and Y are not necessarily independent; for if we consider that U and V denote the number of points on two dice, then U + V and U - V are either both even or both odd, and consequently X and Y are dependent.
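The dice illustration can be made concrete by enumeration. The sketch below assumes two fair dice with U and V the points shown; the helper `expect` and the particular event checked (X = 2, Y = 0) are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from itertools import product

pairs = list(product(range(1, 7), repeat=2))  # all (u, v) outcomes
p = Fraction(1, 36)

def expect(g):
    """Expectation of g(U, V) over the 36 equally likely outcomes."""
    return sum(p * g(u, v) for u, v in pairs)

# Cov(X, Y) with X = U + V and Y = U - V
cov = (expect(lambda u, v: (u + v) * (u - v))
       - expect(lambda u, v: u + v) * expect(lambda u, v: u - v))

# Dependence: compare a joint probability with the product of marginals
p_joint = expect(lambda u, v: int(u + v == 2 and u - v == 0))
p_x = expect(lambda u, v: int(u + v == 2))
p_y = expect(lambda u, v: int(u - v == 0))

print(cov, p_joint, p_x * p_y)
```

The covariance vanishes while the joint probability differs from the product of the marginal probabilities, which is exactly the situation the text describes.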

For the covariance, we have the simple formula

$$\sigma_{12} = E\{X_1 X_2 - \mu_1 X_2 - \mu_2 X_1 + \mu_1 \mu_2\}$$

$$= E\{X_1 X_2\} - \mu_1 \mu_2$$

$$= E\{X_1 X_2\} - E\{X_1\}\, E\{X_2\}.$$

A measure of the degree of dependence between $X_1$ and $X_2$ is given by the correlation coefficient

$$\rho_{12} = \frac{\sigma_{12}}{\sigma_1 \sigma_2} = \frac{\text{Cov}\{X_1, X_2\}}{\{\text{Var}(X_1)\, \text{Var}(X_2)\}^{1/2}}.$$

[Sometimes $\rho_{12}$ is also denoted by $r_{12}$. Obviously $r_{12} = r_{21}$, and it has the same sign as $\sigma_{12}$.]

Theorem 3. The permissible values for the correlation coefficient are confined to the interval [-1, +1], that is, $-1 \le \rho \le +1$.

We consider the random variable

$$Z = a\,(X_1 - \mu_1) + b\,(X_2 - \mu_2)$$

where a and b are real parameters.

Then

$$E(Z^2) = E\{[\,a\,(X_1 - \mu_1) + b\,(X_2 - \mu_2)\,]^2\}$$

$$= a^2 \sigma_1^2 + 2ab\,\sigma_{12} + b^2 \sigma_2^2.$$

Now $E(Z^2) \ge 0$ for all real a and b, which requires

$$\begin{vmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{vmatrix} \ge 0,$$

i.e. $\sigma_1^2 \sigma_2^2 - \sigma_{12}^2 \ge 0$ or $\sigma_{12}^2 - \sigma_1^2 \sigma_2^2 \le 0$,

whence $\dfrac{\sigma_{12}^2}{\sigma_1^2 \sigma_2^2} \le 1$ or $\rho^2 \le 1$, i.e. $-1 \le \rho \le +1$.

In the extreme case, when $\rho = \pm 1$, there is complete dependence between the variables $X_1$ and $X_2$.

Note. When the variables $X_1$ and $X_2$ are independent, then $E\{X_1 X_2\} = E\{X_1\}\, E\{X_2\}$ and

$$\text{Var}\{X_1 \pm X_2\} = E[(X_1 \pm X_2)^2] - [E(X_1 \pm X_2)]^2$$

$$= E(X_1^2) + E(X_2^2) \pm 2E(X_1 X_2) - [\,E^2(X_1) + E^2(X_2) \pm 2E(X_1)\,E(X_2)\,]$$

$$= E(X_1^2) - E^2(X_1) + E(X_2^2) - E^2(X_2)$$

$$= \text{Var}(X_1) + \text{Var}(X_2).$$

This result can be extended to obtain the variance of the sum of a finite number of independent random variables:

$$\text{Var}\{X_1 + X_2 + \ldots + X_n\} = \text{Var}(X_1) + \text{Var}(X_2) + \ldots + \text{Var}(X_n).$$


Exercises 

Ex. 1. Prove that 

(i) Cov (X + a, Y + b) = Cov (X, Y)

(ii) Cov (aX, bY) = ab Cov (X, Y)

m c ° v (■'*/• Y ^ ? y~ c °'’ {xY) 

Ex. 2. If X and Y are two independent variables, then

$$\frac{\text{Var}(XY)}{E^2(X)\, E^2(Y)} = C_X^2\, C_Y^2 + C_X^2 + C_Y^2$$

where

$$C_X = \frac{\sqrt{\text{Var}(X)}}{E(X)}, \qquad C_Y = \frac{\sqrt{\text{Var}(Y)}}{E(Y)}$$

are the so-called coefficients of variation of X and Y.

[B.Sc. Poona 1967]

Ex. 3. X, Y and Z are independent random variables with E(X) = E(Y) = 2 and E(Z) = -3. If V(X) = 1 and V(Y) = V(Z) = 3, find (i) E(X + Y + Z), (ii) E[(X)(Y + Z)], (iii) V(3Y + Z). Which, if any, of your answers depend essentially on the independence of the random variables?





Ex. 4. (a) If X and Y are two random variables, prove, stating the necessary conditions,

(i) E(X + Y) = E(X) + E(Y)

(ii) E(XY) = E(X) E(Y)

(b) Find the expression for the variance of the sum and difference of two variates, when

(1) they are independent, (2) they are correlated.

5.8. Conditional Expectation (Marginal and conditional probability distributions)

Marginal and conditional probability distributions are introduced for two-dimensional random variables by the following definitions.

Definition (A). For a discrete two-dimensional random variable (X, Y) with the possible values $x_1, x_2, \ldots$ for X and $y_1, y_2, \ldots$ for Y, and the probabilities $f(x_i, y_j)$, called joint probabilities of $(x_i, y_j)$, the marginal probability that $X = x_i$ is

$$(1) \qquad g(x_i) = \sum_j f(x_i, y_j) \quad \text{for } i = 1, 2, \ldots$$

and the marginal probability of $Y = y_j$ is

$$h(y_j) = \sum_i f(x_i, y_j) \quad \text{for } j = 1, 2, \ldots$$

(A') If (X, Y) has the joint probability density f(x, y), then the marginal probability density of X is

$$g(x) = \int_{-\infty}^{\infty} f(x, y)\, dy$$

and the marginal probability density of Y is

$$h(y) = \int_{-\infty}^{\infty} f(x, y)\, dx.$$

Observe, since $\sum_i \sum_j f(x_i, y_j) = 1$,

$$\sum_i g(x_i) = \sum_i \sum_j f(x_i, y_j) = 1$$

and

$$\sum_j h(y_j) = \sum_j \sum_i f(x_i, y_j) = 1.$$

Likewise, since $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1$,

$$\int_{-\infty}^{\infty} g(x)\, dx = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1.$$

Similarly $\int_{-\infty}^{\infty} h(y)\, dy = 1$.




Definition (B). The conditional probability of $X = x_i$ for given $Y = y_j$ is defined for all $y_j$ such that $h(y_j) > 0$, and is equal to

$$g(x_i \mid y_j) = \frac{f(x_i, y_j)}{h(y_j)} \quad \text{for } i = 1, 2, \ldots$$

and the conditional probability of $Y = y_j$ for given $X = x_i$ is defined for all $x_i$ such that $g(x_i) > 0$, and is equal to

$$h(y_j \mid x_i) = \frac{f(x_i, y_j)}{g(x_i)} \quad \text{for } j = 1, 2, \ldots$$

Notice

$$\sum_i g(x_i \mid y_j) = \sum_i \frac{f(x_i, y_j)}{h(y_j)} = \frac{h(y_j)}{h(y_j)} = 1.$$

Similarly $\sum_j h(y_j \mid x_i) = 1$.

(B') If (X, Y) has the joint probability density f(x, y), then the conditional probability density of X for given Y is defined for any y such that h(y) > 0, and is equal to

$$g(x \mid y) = \frac{f(x, y)}{h(y)}$$

and the conditional probability density of Y for given X is defined for any x such that g(x) > 0, and is equal to

$$h(y \mid x) = \frac{f(x, y)}{g(x)}.$$

Notice

$$\int_{-\infty}^{\infty} g(x \mid y)\, dx = \frac{1}{h(y)} \int_{-\infty}^{\infty} f(x, y)\, dx = 1,$$

and similarly

$$\int_{-\infty}^{\infty} h(y \mid x)\, dy = \frac{1}{g(x)} \int_{-\infty}^{\infty} f(x, y)\, dy = 1.$$

The random variable X is called independent of the random variable Y if

$$g(x \mid y) = g(x)$$

for all values of x and y.

Theorem 4. X and Y are independent if and only if f(x, y) can be factored in the form

$$f(x, y) = \phi(x)\, \psi(y)$$

where $\phi(x)$ depends only on x and $\psi(y)$ only on y.

Cor. X and Y are independent if and only if

$$f(x, y) = g(x)\, h(y).$$

Ex. 1. (Illustration that $\rho = 0$ or cov (X, Y) = 0 does not imply independence between X and Y.)

Suppose that (X, Y) has a joint probability distribution given by the table.

    Y \ X |  -1     0     1   | Sum
    ------+-------------------+-----
     -1   | 1/8   1/8   1/8   | 3/8
      0   | 1/8    0    1/8   | 2/8
      1   | 1/8   1/8   1/8   | 3/8
    ------+-------------------+-----
     Sum  | 3/8   2/8   3/8   |  1

(a) Show that E(XY) = E(X) E(Y), and hence $\rho = 0$.

(b) Indicate why X and Y are not independent.

Sol. (a)

$$E(X) = (-1)(3/8) + (0)(2/8) + (1)(3/8) = 0$$

$$E(Y) = (-1)(3/8) + (0)(2/8) + (1)(3/8) = 0$$

$$E(XY) = (-1)(-1)(1/8) + (-1)(0)(1/8) + (-1)(1)(1/8)$$
$$\quad + (0)(-1)(1/8) + (0)(0)(0) + (0)(1)(1/8)$$
$$\quad + (1)(-1)(1/8) + (1)(0)(1/8) + (1)(1)(1/8) = 0.$$

Thus E(XY) = E(X) E(Y), which implies that $\rho$ or cov (X, Y) = 0.

(b) X and Y are not independent, since

$$P(X = -1,\, Y = -1) \ne P(X = -1)\, P(Y = -1),$$

for P(X = -1, Y = -1) = 1/8, P(X = -1) = 3/8, P(Y = -1) = 3/8, and $1/8 \ne (3/8)^2$.
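The marginal sums, the vanishing covariance and the failure of the factorisation f(x, y) = g(x) h(y) can all be read off the table mechanically. The following sketch assumes the joint probabilities of this example (1/8 in every cell except the centre, which is 0); the dictionary layout is an illustrative choice.

```python
from fractions import Fraction as F

values = (-1, 0, 1)
# joint[y][x] = P(X = x, Y = y); rows are indexed by y, as in the table
joint = {
    -1: {-1: F(1, 8), 0: F(1, 8), 1: F(1, 8)},
     0: {-1: F(1, 8), 0: F(0),    1: F(1, 8)},
     1: {-1: F(1, 8), 0: F(1, 8), 1: F(1, 8)},
}

g = {x: sum(joint[y][x] for y in values) for x in values}  # marginal of X
h = {y: sum(joint[y][x] for x in values) for y in values}  # marginal of Y

EX = sum(x * g[x] for x in values)
EY = sum(y * h[y] for y in values)
EXY = sum(x * y * joint[y][x] for x in values for y in values)
cov = EXY - EX * EY

# Independence would require every cell to equal g(x) * h(y)
independent = all(joint[y][x] == g[x] * h[y] for x in values for y in values)

print(g, h, cov, independent)
```

A single cell violating f(x, y) = g(x) h(y) is enough to rule out independence, so the check stops being informative only when every cell factorises.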

Ex. 2. The faces of two dice are marked as follows:

Die No. 1: 0, 0, 1, 2, 2, 3
Die No. 2: 0, 1, 1, 2, 3, 4

The two dice are thrown together, and the sum of the integers on the faces turning uppermost denotes the value of a random variable X.

Find the probability function of X.

What is the modal value of X?

We obtain the finite equiprobable space S consisting of the 36 ordered pairs of numbers between 0 and 4:

S = {(0, 0), (0, 1), ..., (3, 4)}

Now let X assign to each point (a, b) in S the sum of its numbers, i.e. X(a, b) = a + b. Then X is also a random variable on S with image set

X(S) = {0, 1, 2, 3, 4, 5, 6, 7}.

The distribution g of X follows:

    x_i    :   0      1      2      3      4      5      6      7
    g(x_i) :  2/36   5/36   6/36   8/36   7/36   4/36   3/36   1/36

We obtain, for example, g(4) = 7/36 from the fact that (0, 4), (1, 3), (2, 2) and (3, 1) are those points of S for which the sum of the components is 4; hence

$$g(4) = P(X = 4) = P[\{(0, 4), (1, 3), (2, 2), (3, 1)\}] = \frac{2}{36} + \frac{1}{36} + \frac{2}{36} + \frac{2}{36} = \frac{7}{36}.$$

For the probabilities of the various numbers on dice I and II are given by

    Die I               Die II
    P(x = 0) = 2/6      P(y = 0) = 1/6
    P(x = 1) = 1/6      P(y = 1) = 2/6
    P(x = 2) = 2/6      P(y = 2) = 1/6
    P(x = 3) = 1/6      P(y = 3) = 1/6
                        P(y = 4) = 1/6

In the above table g(3) = 8/36, i.e. when X = 3 the probability is greatest. Hence the modal value of X is 3.
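The distribution of X can be reproduced by listing the 36 equally likely pairs of faces. This sketch assumes the face markings 0, 0, 1, 2, 2, 3 for the first die and 0, 1, 1, 2, 3, 4 for the second, which are the markings consistent with the face probabilities tabulated in the solution; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

die1 = (0, 0, 1, 2, 2, 3)  # assumed markings of Die No. 1
die2 = (0, 1, 1, 2, 3, 4)  # assumed markings of Die No. 2

# Number of the 36 equally likely face pairs giving each sum
counts = Counter(a + b for a, b in product(die1, die2))

g = {x: Fraction(counts[x], 36) for x in sorted(counts)}
mode = max(g, key=g.get)

for x in g:
    print(x, f"{counts[x]}/36")
print("modal value:", mode)
```

Counting pairs of faces, rather than pairs of distinct numbers, is what makes the sample space equiprobable here.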

5.81. Conditional Expectation. Just as we have defined the expected value of a random variable X (in terms of its probability distribution) as

$$\sum_{i=1}^{\infty} x_i\, p(x_i) \quad \text{or} \quad \int_{-\infty}^{\infty} x\, f(x)\, dx,$$

so we can define the conditional expectation of a random variable (in terms of its conditional probability distribution).

Definition (a). If (X, Y) is a two-dimensional discrete random variable, the conditional expectation of X for given $Y = y_j$ is defined as

$$E(X \mid y_j) = \sum_{i=1}^{\infty} x_i\, g(x_i \mid y_j).$$

Similarly the conditional expectation of Y for given $X = x_i$ is defined as

$$E(Y \mid x_i) = \sum_{j=1}^{\infty} y_j\, h(y_j \mid x_i).$$

(b) The function of y

$$E(X \mid y) = \int_{-\infty}^{\infty} x\, g(x \mid y)\, dx$$

is called the conditional expectation of X for given y, and the function of x

$$E(Y \mid x) = \int_{-\infty}^{\infty} y\, h(y \mid x)\, dy$$

the conditional expectation of Y for given x.

Let us note that E(X | y) is the value of the random variable E(X | Y). Since E(Y | X) and E(X | Y) are random variables, it will be meaningful to speak of their expectations.





Theorem 5.

$$E[E(X \mid Y)] = E(X), \qquad E[E(Y \mid X)] = E(Y).$$

Proof. By definition,

$$E(X \mid y) = \int_{-\infty}^{\infty} x\, g(x \mid y)\, dx = \int_{-\infty}^{\infty} x\, \frac{f(x, y)}{h(y)}\, dx,$$

where f(x, y) and h(y) are the joint probability density function of (X, Y) and the marginal pdf of Y, respectively.

Hence

$$E[E(X \mid Y)] = \int_{-\infty}^{\infty} E(X \mid y)\, h(y)\, dy = \int_{-\infty}^{\infty} \left[\int_{-\infty}^{\infty} x\, \frac{f(x, y)}{h(y)}\, dx\right] h(y)\, dy.$$

Changing the order of integration we get

$$E[E(X \mid Y)] = \int_{-\infty}^{\infty} x \left[\int_{-\infty}^{\infty} f(x, y)\, dy\right] dx = \int_{-\infty}^{\infty} x\, g(x)\, dx = E(X).$$

For the discrete case

$$E[E(X \mid Y)] = \sum_{j=1}^{\infty} E(X \mid y_j)\, h(y_j)$$

$$= \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} x_i\, g(x_i \mid y_j)\, h(y_j)$$

$$= \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} x_i\, \frac{f(x_i, y_j)}{h(y_j)}\, h(y_j)$$

$$= \sum_{i=1}^{\infty} x_i \sum_{j=1}^{\infty} f(x_i, y_j)$$

$$= \sum_{i=1}^{\infty} x_i\, g(x_i) = E(X).$$

Similarly the result E[E(Y | X)] = E(Y) can be proved.

In case X and Y happen to be independent variables, then E(X | Y) = E(X) and E(Y | X) = E(Y).

The graph of E(X | y) as a function of y,

$$x = E(X \mid y),$$

is called the regression curve of X on Y, and the graph of E(Y | x),

$$y = E(Y \mid x),$$

is called the regression curve of Y on X.

It may happen that either or both of the regression curves are in fact straight lines. That is, E(Y | x) may be a linear function of x, or E(X | y) may be a linear function of y. In this case we say that the regression of the mean of Y on X (say) is linear.
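The identity E[E(X | Y)] = E(X) can be checked on any small joint distribution. The sketch below uses a hypothetical joint probability table chosen purely for illustration; none of its numbers come from the text.

```python
from fractions import Fraction as F

# Hypothetical joint probabilities f(x, y), chosen only for illustration
f = {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 0): F(1, 8), (1, 1): F(3, 8)}
xs = sorted({x for x, _ in f})
ys = sorted({y for _, y in f})

g = {x: sum(f[x, y] for y in ys) for x in xs}  # marginal of X
h = {y: sum(f[x, y] for x in xs) for y in ys}  # marginal of Y

def cond_exp_x(y):
    """E(X | y) = sum over i of x_i * f(x_i, y) / h(y)."""
    return sum(x * f[x, y] for x in xs) / h[y]

lhs = sum(cond_exp_x(y) * h[y] for y in ys)    # E[E(X | Y)]
rhs = sum(x * g[x] for x in xs)                # E(X)
print(lhs, rhs)
```

The values E(X | y) plotted against y would trace the regression curve of X on Y for this table.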





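Ex. 1. Find E(Y | x) and E(X | y) for the density function

$$f(x, y) = 2/\pi,$$

where (x, y) belongs to a semi-circle of unit radius. Hence or otherwise show that $\rho_{xy} = 0$.

We notice f(x, y) is a pdf, since

$$\int_{-1}^{1} \int_{0}^{\sqrt{1-x^2}} \frac{2}{\pi}\, dy\, dx \qquad [\text{the equation of the semi-circle being } x^2 + y^2 = 1,\ y \ge 0]$$

$$= \int_{-1}^{1} \frac{2}{\pi} \sqrt{1 - x^2}\, dx = \frac{4}{\pi} \int_{0}^{1} \sqrt{1 - x^2}\, dx = \frac{4}{\pi} \cdot \frac{\pi}{4} = 1.$$

$$g(x) = \int_{0}^{\sqrt{1-x^2}} \frac{2}{\pi}\, dy = \frac{2}{\pi} \sqrt{1 - x^2}, \qquad -1 < x < 1$$

$$h(y) = \int_{-\sqrt{1-y^2}}^{\sqrt{1-y^2}} f(x, y)\, dx \qquad [\because\ x^2 + y^2 = 1 \text{ yields } x = \pm\sqrt{1 - y^2}\,]$$

$$= \frac{2}{\pi} \cdot 2\sqrt{1 - y^2} = \frac{4}{\pi} \sqrt{1 - y^2}, \qquad 0 < y < 1.$$

Therefore

$$g(x \mid y) = \frac{f(x, y)}{h(y)} = \frac{2/\pi}{(4/\pi)\sqrt{1 - y^2}} = \frac{1}{2\sqrt{1 - y^2}}$$

$$h(y \mid x) = \frac{f(x, y)}{g(x)} = \frac{2/\pi}{(2/\pi)\sqrt{1 - x^2}} = \frac{1}{\sqrt{1 - x^2}}.$$

Hence

$$E(Y \mid x) = \int_{0}^{\sqrt{1-x^2}} y\, h(y \mid x)\, dy = \frac{1}{\sqrt{1 - x^2}} \left[\frac{y^2}{2}\right]_{0}^{\sqrt{1-x^2}} = \frac{\sqrt{1 - x^2}}{2}.$$

Similarly

$$E(X \mid y) = \int_{-\sqrt{1-y^2}}^{\sqrt{1-y^2}} x\, g(x \mid y)\, dx = \frac{1}{2\sqrt{1 - y^2}} \left[\frac{x^2}{2}\right]_{-\sqrt{1-y^2}}^{\sqrt{1-y^2}} = 0.$$

Since E(X | y) = 0, that is, the regression curve of X on Y is constant, both E(X) = E[E(X | Y)] and E(XY) = E[Y E(X | Y)] vanish, and hence $\rho_{xy}$ is zero.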

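Ex. 2. If

$$f(x, y) = \frac{6 - x - y}{8}, \quad 0 < x < 2;\ 2 < y < 4$$
$$\phantom{f(x, y)} = 0, \quad \text{otherwise,}$$

find (a) the marginal probability functions, (b) the conditional probability functions, (c) V(X), V(Y), (d) the coefficient of correlation.

Now

$$g(x) = \frac{1}{8} \int_{2}^{4} (6 - x - y)\, dy = \frac{1}{4}(3 - x), \qquad 0 < x < 2$$

$$h(y) = \frac{1}{8} \int_{0}^{2} (6 - x - y)\, dx = \frac{1}{4}(5 - y), \qquad 2 < y < 4$$

$$h(y \mid x) = \frac{f(x, y)}{g(x)} = \frac{6 - x - y}{2(3 - x)}, \qquad 2 < y < 4,\ 0 < x < 2$$

$$g(x \mid y) = \frac{f(x, y)}{h(y)} = \frac{6 - x - y}{2(5 - y)}, \qquad 0 < x < 2,\ 2 < y < 4$$

$$E(X) = \int_{0}^{2} x\, g(x)\, dx = \frac{1}{4} \int_{0}^{2} (3x - x^2)\, dx = \frac{5}{6}$$

$$E(Y) = \int_{2}^{4} y\, h(y)\, dy = \frac{1}{4} \int_{2}^{4} (5y - y^2)\, dy = \frac{17}{6}$$

$$V(X) = E(X^2) - E^2(X) = \frac{1}{4} \int_{0}^{2} (3 - x)\, x^2\, dx - \left(\frac{5}{6}\right)^2 = 1 - \frac{25}{36} = \frac{11}{36}$$

$$V(Y) = E(Y^2) - E^2(Y) = \frac{1}{4} \int_{2}^{4} (5 - y)\, y^2\, dy - \left(\frac{17}{6}\right)^2 = \frac{25}{3} - \frac{289}{36} = \frac{11}{36}$$

$$E(XY) = \frac{1}{8} \int_{0}^{2} \int_{2}^{4} xy\,(6 - x - y)\, dy\, dx = \frac{1}{8} \int_{0}^{2} x \left(\frac{52}{3} - 6x\right) dx = \frac{7}{3}$$

$$\text{Cov}(X, Y) = E(XY) - E(X)\, E(Y) = \frac{7}{3} - \frac{5 \times 17}{36} = -\frac{1}{36}$$

Coefficient of correlation

$$r_{XY} = \frac{\text{cov}(X, Y)}{\sqrt{V(X)\, V(Y)}} = -\frac{1}{11}.$$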
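The moments in this example can be cross-checked by numerical integration. The following is a rough sketch using a midpoint rule, assuming the density (6 - x - y)/8 on 0 < x < 2, 2 < y < 4; the function name `expect` and the grid size are arbitrary choices, and the results are approximations rather than exact values.

```python
def expect(func, n=200):
    """Midpoint-rule estimate of E[func(X, Y)] for f(x, y) = (6 - x - y)/8
    on the rectangle 0 < x < 2, 2 < y < 4."""
    step = 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        for j in range(n):
            y = 2.0 + (j + 0.5) * step
            total += func(x, y) * (6 - x - y) / 8
    return total * step * step

ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)
vx = expect(lambda x, y: x * x) - ex**2
vy = expect(lambda x, y: y * y) - ey**2
cov = expect(lambda x, y: x * y) - ex * ey
rho = cov / (vx * vy) ** 0.5

print(round(ex, 4), round(ey, 4), round(vx, 4), round(vy, 4), round(rho, 4))
```

The estimates should lie close to 5/6, 17/6, 11/36, 11/36 and -1/11 respectively; a finer grid tightens the agreement.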

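5.9. Chebyshev's inequality and related inequalities.

Theorem 6. Let X be a random variable with the expectation $\mu$. Then

$$P(X < \mu k) \ge 1 - \frac{1}{k} \quad \text{for any } k > 1,$$

provided that P(X < 0) = 0.

Proof. (a) Discrete case.

P(X < 0) = 0 implies that all possible values of X, viz. $x_1, x_2, \ldots$, are $\ge 0$.

Hence,

$$\mu = E(X) = \sum_i x_i\, f(x_i)$$

$$= \sum_{x_i < \mu k} x_i\, f(x_i) + \sum_{x_i \ge \mu k} x_i\, f(x_i)$$

$$\ge \sum_{x_i \ge \mu k} x_i\, f(x_i) \ge \sum_{x_i \ge \mu k} \mu k\, f(x_i)$$

$$= \mu k\, P(X \ge \mu k) = \mu k\, [\,1 - P(X < \mu k)\,],$$

so that

$$\frac{1}{k} \ge 1 - P(X < \mu k)$$

or

$$P(X < \mu k) \ge 1 - \frac{1}{k}.$$

(b) Continuous case.

$$P(X < 0) = 0 \ \Rightarrow\ \int_{-\infty}^{0} f(x)\, dx = 0 \quad \text{and} \quad \int_{-\infty}^{0} x\, f(x)\, dx = 0.$$

Now

$$\mu = E(X) = \int_{-\infty}^{\infty} x\, f(x)\, dx$$

$$= \int_{0}^{\infty} x\, f(x)\, dx \ge \int_{\mu k}^{\infty} x\, f(x)\, dx \ge \mu k \int_{\mu k}^{\infty} f(x)\, dx$$

$$= \mu k\, P(X \ge \mu k) = \mu k\, [\,1 - P(X < \mu k)\,],$$

whence

$$P(X < \mu k) \ge 1 - \frac{1}{k}.$$

This result is known as the generalised form of the Bienaymé-Chebyshev inequality.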

An immediate consequence of this is the following fundamental theorem.

Theorem 7. (Chebyshev's inequality). If the random variable X has the expectation $\mu$ and the variance $\sigma^2$, then for any c > 1 we have the inequality

$$P(\,|X - \mu| < c\sigma\,) \ge 1 - \frac{1}{c^2}$$

or alternatively

$$P(\,|X - \mu| \ge c\sigma\,) \le \frac{1}{c^2}.$$

Proof. From theorem 6, we have on considering a random variable Y

$$P(Y < ak) \ge 1 - 1/k, \quad \text{where } E(Y) = a \text{ and } P(Y < 0) = 0.$$

Let us take $Y = (X - \mu)^2$, so that $E(Y) = E(X - \mu)^2 = \sigma^2$ and all the assumptions of theorem 6 are fulfilled.

Hence, by substituting in

$$P(Y < ak) \ge 1 - 1/k$$

the values $a = \sigma^2$, $k = c^2$, we obtain

$$P(\,|X - \mu| < c\sigma\,) = P[\,(X - \mu)^2 < c^2 \sigma^2\,]$$

$$= P[\,Y < ak\,] \ge 1 - 1/k$$

$$= 1 - 1/c^2.$$

The importance of Chebyshev's inequality lies in the fact that it holds for any probability distribution, provided it has finite variance.

Remark. We note from the equation

$$P(\,|X - \mu| \ge c\sigma\,) \le 1/c^2$$

that if V(X) is small, most of the probability distribution of X is "concentrated" near E(X). This fact may be expressed in the following way.

Suppose that V(X) = 0. Then $P[X = \mu] = 1$, where $\mu = E(X)$.

For $P(\,|X - \mu| \ge c\sigma\,) \le 1/c^2$ can be put as

$$P(\,|X - \mu| \ge \epsilon\,) \le \sigma^2/\epsilon^2 \quad \text{when } \epsilon = c\sigma.$$

Hence $P(\,|X - \mu| \ge \epsilon\,) = 0$ for any $\epsilon > 0$,

and consequently

$$P(\,|X - \mu| < \epsilon\,) = 1 \quad \text{for any } \epsilon > 0.$$

Since $\epsilon$ may be chosen arbitrarily, the result $P[X = \mu] = 1$ is established.
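To see how conservative the bound 1/c² can be, one may compare it with an exact tail probability. The sketch below takes X to be the number shown by one fair die, which is purely an illustrative choice; the names `exact_tail` and `rows` are hypothetical.

```python
from fractions import Fraction
from math import sqrt

faces = range(1, 7)
p = Fraction(1, 6)
mu = sum(p * x for x in faces)               # 7/2
var = sum(p * (x - mu) ** 2 for x in faces)  # 35/12
sigma = sqrt(var)

def exact_tail(c):
    """Exact P(|X - mu| >= c * sigma) for one throw of a fair die."""
    return float(sum(p for x in faces if abs(x - mu) >= c * sigma))

rows = [(c, exact_tail(c), 1 / c**2) for c in (1.2, 1.5, 2.0)]
for c, exact, bound in rows:
    print(f"c = {c}: exact tail = {exact:.3f}, Chebyshev bound = {bound:.3f}")
```

The exact tail never exceeds the bound, and is often far below it, because the inequality uses nothing about the distribution beyond its mean and variance.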

5.91. The Law of Large Numbers (Bernoulli's form)

We shall see below that as the number of repetitions of an experiment increases, $f_A$, the relative frequency of some event A, converges (in a probability sense) to the theoretical probability P(A). It is this fact which allows us to "identify" the relative frequency of an event, based on a large number of repetitions, with the probability of the event. For instance, if an honest coin is tossed n times, the relative frequency of heads recorded in our experiment will tend to 1/2, which is the expected value of the variable.

Theorem 8. Let E be any experiment and let the event A be associated with it as an outcome. Consider n independent repetitions of E; let $n_A$ be the number of times A occurs among the n repetitions, and let

$$f_A = \frac{n_A}{n}.$$





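Let P(A) = p, which is the same for all repetitions. Then for every $\epsilon > 0$, we have

$$P[\,|f_A - p| \ge \epsilon\,] \le \frac{p(1-p)}{n\epsilon^2}$$

or, equivalently,

$$P[\,|f_A - p| < \epsilon\,] \ge 1 - \frac{p(1-p)}{n\epsilon^2}.$$

Proof. Here $n_A$ is a binomially distributed random variable. Then $E(n_A) = np$ and $V(n_A) = np(1 - p)$. Now $f_A = n_A/n$, and hence

$$E(f_A) = p \quad \text{and} \quad V(f_A) = \frac{p(1-p)}{n}.$$

Applying Chebyshev's inequality, viz.

$$P(\,|X - \mu| < c\sigma\,) \ge 1 - 1/c^2,$$

to the random variable $f_A$, with $\epsilon = c\sqrt{p(1-p)/n}$, we obtain

$$P[\,|f_A - p| < \epsilon\,] \ge 1 - \frac{p(1-p)}{n\epsilon^2}.$$

Hence

$$\lim_{n \to \infty} P[\,|f_A - p| < \epsilon\,] = 1 \quad \text{for all } \epsilon > 0,$$

or, equivalently,

$$\lim_{n \to \infty} P[\,|f_A - p| \ge \epsilon\,] = 0.$$

Now we say that the relative frequency $f_A$ "converges" to P(A). When we say that $f_A = n_A/n$ converges to P(A) we mean that the probability of the event

$$\left\{\, \left|\frac{n_A}{n} - P(A)\right| < \epsilon \,\right\}$$

can be made arbitrarily close to one by taking n sufficiently large.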

Note. The reformulation of

$$P[\,|f_A - p| < \epsilon\,] \ge 1 - \frac{p(1-p)}{n\epsilon^2}$$

is

$$P[\,|f_A - p| < \epsilon\,] \ge 1 - \delta \quad \text{whenever } n \ge \frac{p(1-p)}{\epsilon^2 \delta}, \text{ say.}$$

This is Bernoulli's Theorem, which can be stated as follows.

Let $\epsilon\ (> 0)$ and $\delta\ (0 < \delta < 1)$ be two given positive numbers, however small, and let $n_A$ be the number of the occurrences of the event A in n independent trials, in which the constant probability of the event A is p. Then there exists a positive integer $m(\epsilon, \delta)$ such that

$$n > m \ \Rightarrow\ P\left[\, \left|\frac{n_A}{n} - P(A)\right| < \epsilon \,\right] > 1 - \delta.$$

Note that this statement does not guarantee anything about $|f_A - p|$ itself. It only makes it probable that $|f_A - p|$ will be very small.

Ex. 1. How many times does one have to toss a perfectly symmetrical die in order to be at least 95 percent sure that the relative frequency of the outcome 'five' will differ from 1/6 by not more than 0.01?

Sol. Here p = 1/6, 1 - p = 5/6, $\epsilon$ = 0.01, and 1 - $\delta$ = 0.95 or $\delta$ = 0.05.

Hence $P(\,|f_A - 1/6| < 0.01\,) \ge 0.95$ for

$$n \ge \frac{(1/6)(5/6)}{(0.01)^2 (0.05)} = 27{,}778.$$

Ex. 2. The probability of an item being defective is p (assumed unknown) in a production factory. Let n denote the number of items inspected. How large should n be so that we may be 99 percent sure that the relative frequency of defectives differs from p by less than 0.05?

Sol. Since $p(1 - p) \le 1/4$, we use $P[\,|f_A - p| < \epsilon\,] \ge 1 - \delta$ whenever $n \ge 1/(4\epsilon^2 \delta)$.

With the values $\epsilon$ = 0.05, 1 - $\delta$ = 0.99 or $\delta$ = 0.01, we find

$$n = \frac{1}{4\,(0.05)^2\,(0.01)} = 10{,}000.$$

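5.92. Another Form of the Law of Large Numbers.

Suppose that $X_1, \ldots, X_n$ are identically distributed, independent random variables with finite mean and variance. Let $E(X_i) = \mu$ and $V(X_i) = \sigma^2$. Then $\bar{X} = \frac{1}{n}(X_1 + \ldots + X_n)$ has

$$E(\bar{X}) = E\Big(\frac{1}{n} \sum X_i\Big) = \frac{1}{n} \sum E(X_i) = \frac{1}{n} \cdot n\mu = \mu$$

and

$$V(\bar{X}) = \frac{1}{n^2} \sum V(X_i) = \frac{1}{n^2} \cdot n\sigma^2 = \frac{\sigma^2}{n}.$$

Now $\bar{X}$ is a function of $X_1, \ldots, X_n$, namely their arithmetic mean, and hence is again a random variable.

Applying Chebyshev's inequality to the random variable $\bar{X}$, we find

$$P\left[\, |\bar{X} - \mu| < \frac{c\sigma}{\sqrt{n}} \,\right] \ge 1 - \frac{1}{c^2}.$$

Letting $\dfrac{c\sigma}{\sqrt{n}} = \epsilon$ gives $c = \dfrac{\sqrt{n}\,\epsilon}{\sigma}$, and then

$$P[\,|\bar{X} - \mu| < \epsilon\,] \ge 1 - \frac{\sigma^2}{\epsilon^2 n} \to 1 \quad \text{as } n \to \infty.$$

Then we say that the 'arithmetic mean' converges to E(X).

The above Laws of Large Numbers are known as the weak Laws of Large Numbers.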

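Note that if $A_n$ stands for the event $|n_A/n - p| \ge \epsilon$, we have shown that the probability of $A_n$ is at most $\dfrac{1}{4n\epsilon^2}$, in view of the fact

$$P\left[\, \left|\frac{n_A}{n} - p\right| \ge \epsilon \,\right] \le \frac{p(1-p)}{n\epsilon^2} \le \frac{1}{4n\epsilon^2} \to 0 \quad \text{with } n \to \infty$$

[the maximum value of p(1 - p) being 1/4].

But the probability of this event for some $n \ge m$ is given by

$$P(A_m + A_{m+1} + \ldots) \le P(A_m) + P(A_{m+1}) + \ldots \le \frac{1}{4\epsilon^2} \left(\frac{1}{m} + \frac{1}{m+1} + \ldots\right)$$

and, since the series $\dfrac{1}{m} + \dfrac{1}{m+1} + \ldots$ diverges, this tells us nothing about the probability. However, there is a stronger form of the law of large numbers.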

Strong Law of Large Numbers.

Assume $X_1, X_2, \ldots$ are the outcomes from a sequence of repetitions of an experiment. Further assume that M(X) is a random variable for which $E\{M(X)\} = \mu$ exists. Now consider the sequence of averages

$$M(X_1), \quad \frac{M(X_1) + M(X_2)}{2}, \quad \ldots, \quad \frac{1}{n}\,[\,M(X_1) + \ldots + M(X_n)\,], \quad \ldots$$

and in particular whether it converges to $E\{M(X)\}$.

Then the strong law says

$$P\left\{ \lim_{n \to \infty} \frac{1}{n}\,[\,M(X_1) + \ldots + M(X_n)\,] = \mu \right\} = 1.$$

The strong law implies that the limit of the average approaches the common expected value of the above-mentioned independent variables.

If the mean of a probability distribution does not exist, then the sample mean need not have a distribution that becomes 'concentrated' as the sample size increases. For the Cauchy distribution

$$f(x) = \frac{1}{\pi}\, \frac{c}{c^2 + x^2}, \qquad -\infty < x < \infty,$$

the mean does not exist; in fact it can be shown that the distribution of the sample mean is always the same: it is exactly the Cauchy distribution.

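Theorem 9. With the probability approaching 1 or certainty as near as we please, the arithmetic mean of values actually assumed by n stochastic variables will differ from the arithmetic mean of their expectations by less than any given number, however small, provided the number of variables can be taken sufficiently large and provided that the condition

$$\frac{B_n}{n^2} \to 0 \quad \text{as } n \to \infty,$$

where $B_n$ denotes the variance of the sum of the n variables, is fulfilled.

Proof. To fix ideas, assume an infinite sequence of probability distributions such that for each n we have n mutually independent variables $X_1, X_2, \ldots, X_n$ with the prescribed distributions, and assume further that the means and variances exist.

Let $\mu_i = E(X_i)$, $\sigma_i^2 = V(X_i)$. Then the sum $X = X_1 + X_2 + \ldots + X_n$ has also finite mean and variance.

Now

$$E(X) = \sum_{i=1}^{n} \mu_i = \mu_1 + \mu_2 + \ldots + \mu_n$$

$$B_n = V(X) = \sum_{i=1}^{n} V(X_i) = \sigma_1^2 + \sigma_2^2 + \ldots + \sigma_n^2.$$

Let

$$U = \{X_1 + X_2 + \ldots + X_n - (\mu_1 + \mu_2 + \ldots + \mu_n)\}^2.$$

Then Chebyshev's lemma [viz. $P(X < \mu k) \ge 1 - 1/k$ (see theorem 6)] applied to this variable U shows that

$$P[\,U < B_n t^2\,] \ge 1 - 1/t^2, \quad \text{where } B_n = E(U),$$

or

$$P[\,|U^{1/2}| < \sqrt{B_n}\; t\,] \ge 1 - 1/t^2.$$

Put

$$t = \frac{n\epsilon}{\sqrt{B_n}},$$

where $\epsilon$ is an arbitrary positive number.

Then

$$P\left[\, \left|\frac{X_1 + X_2 + \ldots + X_n}{n} - \frac{\mu_1 + \mu_2 + \ldots + \mu_n}{n}\right| < \epsilon \,\right] \ge 1 - \frac{B_n}{n^2 \epsilon^2} \to 1,$$

provided that the quotient $\dfrac{B_n}{n^2} \to 0$ with $n \to \infty$.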





Worked Examples.

Ex. 1. If the variable $x_i$ assumes the value $2^{\,i - 2\log i}$ with probability $2^{-i}$, i = 1, 2, ..., examine if the law of large numbers holds in this case. [Delhi, M.A. (Stat.) 64]

Sol. Here all the variables $x_1, x_2, x_3, \ldots$ are identical, i.e. have the same probability distribution.

Now

$$E(X) = \sum_i 2^{-i} \cdot 2^{\,i - 2\log i} \quad (\text{summation on } i)$$

$$= \sum_i 2^{-2\log i} = \sum_i \frac{1}{\exp\{(2\log i)\log 2\}}$$

$$= \sum_i \frac{1}{(\exp \log i)^{\log 4}} = \sum_{i=1}^{\infty} \frac{1}{i^{\log 4}}$$

$$= 1 + \frac{1}{2^{\log 4}} + \frac{1}{3^{\log 4}} + \ldots$$

The series on the right side is convergent in view of the convergence of the Zeta-series

$$\sum \frac{1}{n^p} \quad \text{for } p > 1.$$

This shows that the mean of the random variable exists, and hence the law of large numbers will hold in this case.

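Ex. 2. (a) Let $x_i$ assume two values, i and -i, with equal probabilities. Then the law of large numbers cannot be applied to the variables $x_1, x_2, x_3, \ldots$

(b) If $x_i$ can have only two values, $i^{\alpha}$ and $-i^{\alpha}$, with equal probabilities, the law of large numbers can be applied to $x_1, x_2, \ldots$ if $\alpha < \tfrac{1}{2}$.

Sol. (a) Here $x_i$: i, -i with P: 1/2, 1/2.

Then

$$E(X_i) = i \cdot \tfrac{1}{2} - i \cdot \tfrac{1}{2} = 0$$

$$E(X_i^2) = i^2 \cdot \tfrac{1}{2} + i^2 \cdot \tfrac{1}{2} = i^2$$

$$V(X_i) = E(X_i^2) - E^2(X_i) = i^2.$$

Let $X = X_1 + X_2 + \ldots + X_n$.

Then

$$E(X) = E\Big(\sum_i X_i\Big) = \sum_{i=1}^{n} E(X_i) = 0$$

$$B_n = V(X) = V\Big(\sum_i X_i\Big) = \sum_i V(X_i) = \sum_i i^2 = \frac{n(n+1)(2n+1)}{6}.$$

Applying Chebyshev's inequality to the random variable $X = X_1 + X_2 + \ldots + X_n$, we obtain

$$P\left[\, \left|\frac{X - E(X)}{n}\right| < \epsilon \,\right] \ge 1 - \frac{B_n}{n^2 \epsilon^2}.$$

Obviously

$$\frac{B_n}{n^2} = \frac{(n+1)(2n+1)}{6n},$$

which $\to \infty$ as $n \to \infty$.

Hence the law of large numbers does not hold.

(b) As above,

$$E(X_i) = i^{\alpha} \cdot \tfrac{1}{2} - i^{\alpha} \cdot \tfrac{1}{2} = 0$$

$$E(X_i^2) = \frac{i^{2\alpha}}{2} + \frac{i^{2\alpha}}{2} = i^{2\alpha}.$$

Hence

$$V(X_i) = E(X_i^2) - E^2(X_i) = i^{2\alpha} - 0 = i^{2\alpha}.$$

Now $X = X_1 + X_2 + \ldots + X_n \ \Rightarrow\ E(X) = \sum E(X_i) = 0$, and

$$B_n = V(X) = \sum_i V(X_i) = \sum_i i^{2\alpha} \quad (\text{summation on } i = 1, 2, \ldots, n)$$

$$= 1^{2\alpha} + 2^{2\alpha} + 3^{2\alpha} + \ldots + n^{2\alpha}.$$

The Euler-Maclaurin formula provides, for large n,

$$B_n = V(X) \approx \int_0^n x^{2\alpha}\, dx = \frac{n^{2\alpha + 1}}{2\alpha + 1}.$$

Now

$$\frac{B_n}{n^2} \approx \frac{n^{2\alpha - 1}}{2\alpha + 1} \to 0 \quad \text{iff } 2\alpha - 1 < 0, \text{ i.e. } \alpha < \tfrac{1}{2}.$$

This proves what we wished.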

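Ex. 3. $\{X_n\}$ is a sequence of mutually independent random variables:

$$X_n = \pm 1 \ \text{ with probability } \ \frac{1 - 2^{-n}}{2} \ \text{ each,}$$

and

$$X_n = \pm 2^n \ \text{ with probability } \ 2^{-n-1} \ \text{ each.}$$

Examine whether the weak law of large numbers can be applied to the sequence $\{X_n\}$. [Delhi M.A. (Stat.) 65, Dibrugarh M.Sc. 67]

Sol. Total probability

$$= \frac{1 - 2^{-n}}{2} + \frac{1 - 2^{-n}}{2} + 2^{-n-1} + 2^{-n-1} = 1.$$

$$E(X_i) = 1 \cdot \frac{1 - 2^{-i}}{2} - 1 \cdot \frac{1 - 2^{-i}}{2} + 2^{i} \cdot 2^{-i-1} - 2^{i} \cdot 2^{-i-1} = 0$$

$$E(X_i^2) = 1^2 \cdot \frac{1 - 2^{-i}}{2} + (-1)^2 \cdot \frac{1 - 2^{-i}}{2} + (2^{i})^2 \cdot 2^{-i-1} + (-2^{i})^2 \cdot 2^{-i-1}$$

$$= 1 - 2^{-i} + 2^{i-1} + 2^{i-1} = 1 - 2^{-i} + 2^{i}$$

$$V(X_i) = E(X_i^2) - E^2(X_i) = 1 - 2^{-i} + 2^{i}.$$

$$\frac{B_n}{n^2} = \frac{\sum V(X_i)}{n^2} = \sum_i \frac{1 - 2^{-i} + 2^{i}}{n^2} \quad (\text{summation on } i = 1, 2, \ldots, n)$$

$$= \frac{1}{n^2} \left[\, n - \left(1 - \frac{1}{2^n}\right) + 2\,(2^n - 1) \,\right] \to \infty \quad \text{as } n \to \infty,$$

because $\lim_{n \to \infty} \dfrac{2^n}{n^2} = \infty$ by L'Hospital's rule.

Hence, as $n \to \infty$, $B_n/n^2$ does not tend to 0, and so the weak law of large numbers does not hold in this case.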

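Ex. 4. $\{X_k\}$, k = 1, 2, ..., is a sequence of independent random variables each taking the values -1, 0, 1. Given that

$$P(X_k = 1) = 1/k = P(X_k = -1); \qquad P(X_k = 0) = 1 - 2/k,$$

examine if the law of large numbers holds for this sequence.

Sol.

$$E(X_k) = -1 \cdot (1/k) + 0 \cdot (1 - 2/k) + 1 \cdot (1/k) = 0$$

$$E(X_k^2) = (-1)^2 \cdot (1/k) + 0^2 \cdot (1 - 2/k) + 1^2 \cdot (1/k) = 2/k$$

$$V(X_k) = E(X_k^2) - E^2(X_k) = 2/k$$

$$\frac{B_n}{n^2} = \frac{1}{n^2} \sum_{k=1}^{n} V(X_k) = \frac{1}{n^2} \sum_{k=1}^{n} \frac{2}{k} = \frac{2}{n} \cdot \frac{1}{n} \left[\, 1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n} \,\right].$$

Keeping Cauchy's first limit theorem in mind, we find

$$n \to \infty \ \Rightarrow\ \frac{1}{n} \left(1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n}\right) \to 0.$$

Then $\lim_{n \to \infty} \dfrac{B_n}{n^2} = 0$, showing that the law of large numbers holds for this sequence.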

Exercises 


1. State and prove Chebyshev's inequality.

2. State and prove the weak law of large numbers.

3. (a) Discuss convergence in probability and convergence with probability 1.

(b) Prove that convergence with probability 1 implies convergence in probability.

4. It is known that on the average 2/3 of the seeds of a certain variety germinate. Use Chebyshev's inequality to obtain an upper bound on the probability that the number germinating will differ from the expected number by more than 10 if 100 seeds are planted.

5. Let X have the p.d.f.

$$f(x) = \frac{1}{2\sqrt{3}}, \quad -\sqrt{3} < x < \sqrt{3}$$
$$\phantom{f(x)} = 0, \quad \text{otherwise.}$$

Obtain $P[\,|X| \ge 3/2\,]$ and compare it with the value obtained by Chebyshev's inequality.




6 Suppose A' has a rectangular distribution on(-l, «■ 
Compare /» > 2 ] a " d ^ 

numbefcan'be applied to the sequence (« where the var.ab.es A„ 
are independent rtf . jrop numbers holds for ihe 

zz — • —» 

,» : n r —»r,:—» 

mutually independent variable „ 
follows ; . 2 n )=2- <2n+,) 

P(X=oT= .1-2- [Bombay 59| 

Hems are produced in suc ^ a 

'SrC slid" K mo,dee ,he, ■ « ““ 

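5.10. Variance for the Linear Combination of Random Variables.

Let U be the linear combination of a finite number of random variables:

$$U = a_1 x_1 + a_2 x_2 + \ldots + a_n x_n.$$

We assume that the variates $x_1, x_2, \ldots, x_n$ have finite variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$ and that the coefficients $a_i$ have fixed values.

We have

$$E(U) = a_1 E(x_1) + a_2 E(x_2) + \ldots + a_n E(x_n)$$

$$= a_1 \mu_1 + a_2 \mu_2 + \ldots + a_n \mu_n$$

$$\therefore\ U - E(U) = a_1 [x_1 - \mu_1] + a_2 [x_2 - \mu_2] + \ldots + a_n [x_n - \mu_n]$$

and

$$\text{Var}(U) = E\{[\,U - E(U)\,]^2\}$$

$$= \sum_{i=1}^{n} a_i^2\, E(x_i - \mu_i)^2 + 2 \sum_{i < j} a_i a_j\, E\{(x_i - \mu_i)(x_j - \mu_j)\}$$

$$= \sum_{i=1}^{n} a_i^2 \sigma_i^2 + 2 \sum_{i < j} a_i a_j\, \sigma_{ij}$$

$$= \sum_{i=1}^{n} a_i^2 \sigma_i^2 + 2 \sum_{i < j} a_i a_j\, \rho_{ij}\, \sigma_i \sigma_j.$$

Cor. 1. Let $a_1 = 1$, $a_2 = 1$, $a_3 = a_4 = \ldots = a_n = 0$; then

$$\text{Var}(x_1 + x_2) = \text{Var}(x_1) + \text{Var}(x_2) + 2\, \text{Cov}(x_1, x_2).$$

Cor. 2. Let $a_1 = 1$, $a_2 = -1$, $a_3 = a_4 = \ldots = a_n = 0$; then

$$\text{Var}(x_1 - x_2) = \text{Var}(x_1) + \text{Var}(x_2) - 2\, \text{Cov}(x_1, x_2).$$

Cor. 3. Let $a_1 = a_2 = \ldots = a_n = 1/n$. Then $U = (x_1 + x_2 + \ldots + x_n)/n = \bar{x}$, and if further we suppose that the $x_i$'s are independent variates with $\text{Var}(x_i) = \sigma^2$ for every i, then

$$\text{Var}(\bar{x}) = \frac{\sigma^2}{n^2} + \frac{\sigma^2}{n^2} + \ldots + \frac{\sigma^2}{n^2} = n\,(\sigma^2/n^2) = \sigma^2/n.$$

[B.A. Hons. Delhi 55]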
[B.A. Hons Delhi 55J 
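The variance formula for a linear combination can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `var_linear_combination` and the sample values are illustrative assumptions, and the covariance matrix is supplied directly rather than estimated from data.

```python
def var_linear_combination(a, cov):
    """Var(a_1 x_1 + ... + a_n x_n) from coefficients a and covariance matrix cov."""
    n = len(a)
    # Sum of a_i^2 * sigma_i^2 (the diagonal of cov holds the variances).
    total = sum(a[i] ** 2 * cov[i][i] for i in range(n))
    # Plus twice the sum over i < j of a_i * a_j * Cov(x_i, x_j).
    total += 2 * sum(
        a[i] * a[j] * cov[i][j] for i in range(n) for j in range(i + 1, n)
    )
    return total

# Cor 3: n independent variates with common variance sigma2 and all a_i = 1/n.
n, sigma2 = 5, 4.0
cov = [[sigma2 if i == j else 0.0 for j in range(n)] for i in range(n)]
var_of_mean = var_linear_combination([1 / n] * n, cov)
print(var_of_mean, sigma2 / n)  # the two values should agree up to rounding

# Cor 2: Var(x1 - x2) with Var(x1) = 1, Var(x2) = 2, Cov(x1, x2) = 0.5.
var_difference = var_linear_combination([1, -1], [[1.0, 0.5], [0.5, 2.0]])
```

With the illustrative numbers above, Cor 2 gives 1 + 2 - 2(0.5) = 2.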


Ex. 1. In a certain series of n independent experiments, the probability of success at the ith experiment is $p_i$, i = 1, 2, 3, ..., n. Obtain the expected value and the variance of the total number of successes.

Sol. We introduce the variable $x_i$ connected with the ith trial in such a way that $x_i = 1$ when the trial results in a success and $x_i = 0$ when it results in failure.

Let m be the total number of successes. Then

$m = x_1 + x_2 + \dots + x_n$

$E(m) = E(x_1) + E(x_2) + \dots + E(x_n)$

But $E(x_i) = 1\cdot p_i + 0\cdot(1-p_i) = p_i$

Therefore $E(m) = p_1 + p_2 + \dots + p_n.$

Also $E(x_i^2) = 1^2\cdot p_i + 0^2\cdot(1-p_i) = p_i$

$\sigma_i^2 = E[x_i - E(x_i)]^2 = E(x_i^2) - E^2(x_i) = p_i - p_i^2 = p_i q_i$

Since the $x_i$'s are independent variates,

$\mathrm{Var}(m) = \mathrm{Var}(x_1) + \mathrm{Var}(x_2) + \dots + \mathrm{Var}(x_n) = p_1 q_1 + p_2 q_2 + \dots + p_n q_n.$

Since $\sigma_i^2 = p_i q_i$,

$\mathrm{Var}(m) = \sum_{i=1}^{n} p_i q_i.$

If $p_i = p$ for each i, $\mathrm{Var}(m) = pq + pq + \dots$ (n times) $= npq.$
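The result of Ex. 1 can be confirmed by brute force for a small n. This is a sketch under stated assumptions: the helper name `successes_mean_var` and the probabilities in `p` are illustrative choices, not taken from the text. It enumerates all outcomes of the n trials and computes the mean and variance of the number of successes directly.

```python
from itertools import product

def successes_mean_var(p):
    """Mean and variance of the number of successes in independent trials
    with success probabilities p[0], p[1], ..., by full enumeration."""
    mean = 0.0
    second_moment = 0.0
    for outcome in product((0, 1), repeat=len(p)):
        prob = 1.0
        for x, p_i in zip(outcome, p):
            prob *= p_i if x else 1 - p_i
        m = sum(outcome)
        mean += m * prob
        second_moment += m * m * prob
    return mean, second_moment - mean ** 2

p = [0.2, 0.5, 0.9]  # illustrative values
mean, var = successes_mean_var(p)
print(mean, sum(p))                          # both equal the sum of the p_i
print(var, sum(pi * (1 - pi) for pi in p))   # both equal the sum of p_i q_i
```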

Exercises 

1. An urn contains pN white and qN black balls, the total number of balls being N. Balls are drawn one by one (without being returned to the urn) until a certain number n of balls is reached. What is the dispersion of the number m of white balls drawn?

2. In a lottery containing n numbers (1, 2, 3, ..., n), m numbers are drawn at a time. Let $x_i$ represent the frequency of a specified number i in N drawings. Prove that

$E(x_i) = Np, \qquad E(x_i - Np)^2 = Npq,$

$E\{(x_i - Np)(x_j - Np)\} = Np(p' - p) \quad (i \ne j),$

where $p = \dfrac{m}{n}$, $q = 1 - p$, $p' = \dfrac{m-1}{n-1}$.

3. Balls are taken one by one out of an urn containing a white and b black balls until the first white ball is drawn. Show that the expectation of the number of black balls preceding the first white ball is $\dfrac{b}{a+1}$.

4. The probability of r successes in a series of n independent trials with constant probability p of success is ${}^{n}C_{r}\, p^r q^{n-r}$, where $0 \le r \le n$. Find the mathematical expectation of $|r - np|$.


5. A population of N distinct elements is sampled with replacement. Prove that the expected value and the variance of the size of the sample for getting r distinct elements are respectively given by

$N\left[\dfrac{1}{N} + \dfrac{1}{N-1} + \dots + \dfrac{1}{N-r+1}\right]$

and

$N\left[\dfrac{1}{(N-1)^2} + \dfrac{2}{(N-2)^2} + \dots + \dfrac{r-1}{(N-r+1)^2}\right].$

6. In a sequence of Bernoulli trials, let X be the length of the run of either successes or failures starting with the first trial. Find E(X) and Var(X).

[Ans. $E(X) = \dfrac{p}{q} + \dfrac{q}{p}$, $\mathrm{Var}(X) = \dfrac{p}{q^2} + \dfrac{q}{p^2} - 2$]

7. A deck of n numbered cards is put into random order so that all arrangements have equal probabilities. If $X_k = 1$ or 0 according as card number k is in its natural place or not, prove that

$P(X_k = 1) = \dfrac{1}{n}, \qquad P(X_k = 0) = \dfrac{n-1}{n},$

$E(X_k) = \dfrac{1}{n}, \qquad \mathrm{Var}(X_k) = \dfrac{n-1}{n^2}, \qquad E(X_h X_k) = \dfrac{1}{n(n-1)},$

$\mathrm{Cov}(X_h, X_k) = \dfrac{1}{n^2(n-1)}, \quad h \ne k.$

Hence deduce that, if $S_n = X_1 + X_2 + \dots + X_n$ denotes the number of matches, $E(S_n) = 1$, $\mathrm{Var}(S_n) = 1$.
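The conclusion of Exercise 7 can be verified exactly for a small deck. The sketch below is illustrative only (the function name `match_count_mean_var` and the deck size 5 are assumptions): it enumerates every arrangement and computes the mean and variance of the number of matches with exact fractions.

```python
from fractions import Fraction
from itertools import permutations

def match_count_mean_var(n):
    """Exact mean and variance of the number of cards lying in their
    natural place when a deck of n cards is arranged at random."""
    counts = [
        sum(1 for position, card in enumerate(perm) if position == card)
        for perm in permutations(range(n))
    ]
    total = len(counts)
    mean = Fraction(sum(counts), total)
    second_moment = Fraction(sum(c * c for c in counts), total)
    return mean, second_moment - mean ** 2

mean, var = match_count_mean_var(5)
print(mean, var)
```

The enumeration grows as n!, so this check is only practical for small n.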

8. A lot contains k different articles. Articles are drawn randomly from the lot with replacement until each article of the lot has been drawn at least once. Writing n for the number of drawings required, show that

$E(n) = k \sum_{i=1}^{k} \dfrac{1}{i}.$

[M. Sc. Lucknow 1967]


9. A vessel contains tickets numbered 1 to N. Find the expectation of the largest number X drawn in n drawings when random sampling with replacement is used.

Hint. $P(X = k) = \left(\dfrac{k}{N}\right)^n - \left(\dfrac{k-1}{N}\right)^n$, so that $E(X) = N - N^{-n} \sum_{k=1}^{N} (k-1)^n.$



10. Use the relation that $E(Ax^a + Bx^b + Cx^c)^2$ is always $\ge 0$, x being a random variable and E denoting the mathematical expectation, to show that

$\begin{vmatrix} \mu_{2a} & \mu_{a+b} & \mu_{a+c} \\ \mu_{a+b} & \mu_{2b} & \mu_{b+c} \\ \mu_{a+c} & \mu_{b+c} & \mu_{2c} \end{vmatrix} \ge 0,$

$\mu_n$ being the nth moment about the mean.

Hence or otherwise show that $\beta_2 - \beta_1 - 1 \ge 0$. Show also that $\beta_2 \ge 1$. [M. Sc. Banaras 1967]

11. The probability of a shot hitting at a point distant r from the centre c of the target follows the law:

$f(r) = \dfrac{2}{\pi(1 + r^2)}$

The target is divided into four regions as follows:

$S_1$: within the circle with centre c and radius $1/\sqrt{3}$.

$S_2$: outside $S_1$ but within the concentric circle of radius 1.

$S_3$: outside $S_2$ but within the concentric circle of radius $\sqrt{3}$.

$S_4$: outside $S_3$.

The scores for hits in the regions $S_1$, $S_2$, $S_3$ and $S_4$ are 4, 3, 2, and 0 respectively. Let $X_1$, $X_2$, $X_3$ and $X_4$ be the number of hits registered in the regions $S_1$, $S_2$, $S_3$ and $S_4$ out of a total of N shots. Find

(i) the probability distribution of $X_1, X_2, X_3, X_4$,

(ii) the expected score after 5 shots, and

(iii) the probability that the score exceeds 15.

[M. Sc. Poona 1964]

12. Suppose r balls are drawn one at a time without replacement from a bag containing n white and m black balls. Find the expected value and the variance of the number of black balls drawn.

Hint. Let $X_k = 1$ if the kth ball drawn is black ($k \le r$), and $X_k = 0$ if it is not black. Then

$E(X_k) = \dfrac{m}{m+n}, \qquad \mathrm{Var}(X_k) = \dfrac{mn}{(m+n)^2},$

$\mathrm{Cov}(X_j, X_k) = \dfrac{m(m-1)}{(m+n)(m+n-1)} - \left(\dfrac{m}{m+n}\right)^2 = \dfrac{-mn}{(m+n)^2(m+n-1)}.$

Let $S = X_1 + X_2 + \dots + X_r$; then

$E(S) = \dfrac{rm}{m+n}, \qquad \mathrm{Var}(S) = \sum_k \mathrm{Var}(X_k) + 2\sum_{j<k} \mathrm{Cov}(X_j, X_k)$

$= \dfrac{r\,mn}{(m+n)^2} - 2\,{}^{r}C_{2}\, \dfrac{mn}{(m+n)^2(m+n-1)}$

$= \dfrac{mnr(m+n-r)}{(m+n)^2(m+n-1)}.$


13. In a lottery m tickets are drawn at a time out of n tickets numbered from 1 to n ($m \le n$). Then the expectation and variance of the random variable S denoting the sum of the numbers on the m tickets drawn are

$\dfrac{m(n+1)}{2}$ and $\dfrac{m(n^2-1)}{12}\left[1 - \dfrac{m-1}{n-1}\right]$

respectively.

14. A box contains N tickets bearing numbers $y_1, y_2, \dots, y_N$. If $\mu$ and $\sigma^2$ denote the mean and variance of this finite population and S is the sum of the numbers on n tickets drawn at random from the box, prove that

$E(S) = n\mu$ and $\mathrm{Var}(S) = n\sigma^2\, \dfrac{N-n}{N-1}.$


15. Given a lot of size m containing s items of a specified kind. If items are to be drawn without replacement until r of the s items have been drawn, show that on the average $\dfrac{r(m+1)}{s+1}$ drawings will be necessary. [f. g j 19 ^ 4 ]

16. A man with n keys wants to open his door and tries the keys independently and at random. Find the mean and variance of the number of trials (i) if unsuccessful keys are not eliminated from further selection, and (ii) if they are.

[Ans. (i) $n$, $n(n-1)$; (ii) $\dfrac{n+1}{2}$, $\dfrac{n^2-1}{12}$]

17. A box contains $2^n$ tickets among which ${}^{n}C_{i}$ bear the number i (i = 0, 1, 2, ..., n). A group of m tickets is drawn from the box, and if the random variable S denotes the sum of the numbers on the tickets drawn, show that

w-7 [>§4] 

[M.A Delhi 1958, M. Sc. Calcutta, 1948J 

5.11. Probability Generating Function

The probability generating function is defined by

$G(t) = E[t^x] = \sum_{x} t^x\, p(x)$

For example, suppose a card is drawn from a well-shuffled pack and that we score the face value of the card drawn (an ace scoring 1 and a picture card 10). The probabilities of scoring 1, 2, 3, ..., 9 are each 1/13, while the probability of scoring 10 is 4/13. Thus

$G(t) = \dfrac{1}{13}\,(t + t^2 + t^3 + \dots + t^9) + \dfrac{4}{13}\, t^{10}$

The quantity t has no particular significance: it is simply used as a carrier for the values of the random variable.

Ex. 1. If X is the number scored with the throw of an unbiased die, find the probability generating function of X.

Here $p(x_i) = 1/6$ for i = 1, 2, 3, 4, 5, 6.

Hence $G(t) = \sum_{i=1}^{6} t^{x_i}\, p(x_i) = \dfrac{1}{6}\,(t + t^2 + \dots + t^6)$

$= \dfrac{1}{6}\cdot\dfrac{t(1 - t^6)}{1 - t}$ (since $x_i = i$ in this case).
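As a small illustration of how a probability generating function can be handled on a computer, the sketch below stores the die's p.g.f. as a list of coefficients. The names `die_pgf`, `pgf_value` and `pgf_derivative_at_one` are illustrative assumptions, not notation from the text. It checks the closed form above at one value of t and uses the standard fact that G'(1) equals the mean.

```python
from fractions import Fraction

# Coefficient list: die_pgf[k] is P(X = k), so G(t) = sum_k die_pgf[k] * t**k.
die_pgf = [Fraction(0)] + [Fraction(1, 6)] * 6

def pgf_value(pgf, t):
    """Evaluate G(t) from its coefficients."""
    return sum(p * t ** k for k, p in enumerate(pgf))

def pgf_derivative_at_one(pgf):
    """G'(1) = sum_k k * P(X = k), which is the mean of X."""
    return sum(k * p for k, p in enumerate(pgf))

t = Fraction(1, 2)
closed_form = Fraction(1, 6) * t * (1 - t ** 6) / (1 - t)
print(pgf_value(die_pgf, t) == closed_form)
print(pgf_derivative_at_one(die_pgf))
```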

5.12. The Moment Generating Function

We shall now define a function that embodies many characteristics of a distribution.

Definition: The moment generating function of the random variable X with probability distribution $p(x_i) = P(X = x_i)$, i = 1, 2, ... is defined by

$M_X(t) = E(e^{tX}) = E\{(e^t)^X\} = \sum_{i=1}^{\infty} \exp(t x_i)\, p(x_i) \qquad (1)$

which is the mean value of the constant $e^t$ raised to the power X.

If X is a continuous random variable with p.d.f. f we define the moment generating function by

$M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\, dx = E(e^{tX}) \qquad (2)$

$M_X(t)$ is the value which the function $M_X$ assumes for the (real) variable t.

Now $M_X(t) = E\left[1 + tX + \dfrac{t^2X^2}{2!} + \dots + \dfrac{t^rX^r}{r!} + \dots\right]$

$= 1 + t\mu_1' + \dfrac{t^2}{2!}\mu_2' + \dots + \dfrac{t^r}{r!}\mu_r' + \dots$

where

$\mu_r' = \int_{-\infty}^{\infty} x^r f(x)\, dx$ or $\sum_i x_i^r\, p(x_i)$

being the moment of order r about the origin (x = 0). For this reason the function $M_X(t)$ is called the moment generating function (m.g.f.) of the distribution about the value x = 0. Similarly the m.g.f. about the value x = a is defined as

$M_{X-a}(t) = E(e^{(X-a)t}) = e^{-at}\, E(e^{tX}) = e^{-at} M_X(t)$

We may note that the m.g.f. is not defined for all values of t, for the series (or integral) may not always exist (that is, converge to a finite value) for all values of t. However, the m.g.f. always exists for the value t = 0.

Note. There is another function closely related to the m.g.f. which is often used in its place. It is called the characteristic function and is defined by $\phi_X(t) = E(e^{itX})$ where $i = \sqrt{-1}$, the imaginary unit. For theoretical reasons, there is considerable advantage in using $\phi_X(t)$ instead of $M_X(t)$. (For one thing, $\phi_X(t)$ always exists for all values of t.)

5.13. Properties of the Moment Generating Function.

(A) We have

$M_X(t) = 1 + t\mu_1' + \dfrac{t^2}{2!}\mu_2' + \dots + \dfrac{t^r}{r!}\mu_r' + \dots = E(e^{Xt})$

Differentiating this r times, we obtain

$\dfrac{d^r}{dt^r} M_X(t) = E\{X^r e^{Xt}\}$

Then setting t = 0, we have

$\dfrac{d^r}{dt^r} M_X(0) = E\{X^r\} = \mu_r'$

This result can also be obtained by differentiating (1) term by term and setting t = 0.
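Property (A) can be illustrated numerically. The following sketch approximates the first two derivatives of the m.g.f. of a fair die at t = 0 by central differences. The function names and the step size h are illustrative assumptions, and finite differences only approximate the derivatives, so the printed values agree with E(X) = 3.5 and E(X²) = 91/6 only up to a small error.

```python
import math

def die_mgf(t):
    """M_X(t) = (1/6) * (e^t + e^{2t} + ... + e^{6t}) for a fair die."""
    return sum(math.exp(t * k) for k in range(1, 7)) / 6

def first_derivative_at_zero(f, h=1e-3):
    # Central difference approximation to f'(0).
    return (f(h) - f(-h)) / (2 * h)

def second_derivative_at_zero(f, h=1e-3):
    # Central difference approximation to f''(0).
    return (f(h) - 2 * f(0.0) + f(-h)) / h ** 2

m1 = first_derivative_at_zero(die_mgf)    # approximates E(X)
m2 = second_derivative_at_zero(die_mgf)   # approximates E(X^2)
variance = m2 - m1 ** 2                   # approximates V(X) = 35/12
print(m1, m2, variance)
```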

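(B) The moment function (short form of m.g.f.) of $\alpha X$ is

$M_{\alpha X}(t) = E\{e^{\alpha X t}\} = E\{e^{X(\alpha t)}\} = M_X(\alpha t)$

(C) The moment function for $X + \beta$ is

$M_{X+\beta}(t) = E\{e^{(X+\beta)t}\} = E\{e^{Xt}\, e^{\beta t}\} = e^{\beta t} E\{e^{Xt}\} = e^{\beta t} M_X(t).$

In particular the moment function for the deviation $X - \mu_1'$ of X from its mean is

$M_{X-\mu_1'}(t) = e^{-\mu_1' t} M_X(t).$

(D) The moment function of $Y = \alpha X + \beta$ is

$M_Y(t) = E(e^{Yt}) = E\{e^{(\alpha X + \beta)t}\} = e^{\beta t} E(e^{\alpha X t}) = e^{\beta t} E\{e^{X(\alpha t)}\} = e^{\beta t} M_X(\alpha t).$

In words: To find the m.g.f. of $Y = \alpha X + \beta$, evaluate the m.g.f. of X at $\alpha t$ (instead of t) and multiply by $e^{\beta t}$.

(E) Effect of change of origin and scale on m.g.f.

Let $Y = \dfrac{X - a}{h} = \dfrac{X}{h} - \dfrac{a}{h}$

Hence here $\alpha = 1/h$, $\beta = -a/h$.

Therefore $M_Y(t) = e^{-at/h}\, M_X(t/h).$

In particular, if $a = \bar{X}$ (the mean of X) and $h = \sigma_X = \sigma$ say,

$M_Y(t) = e^{-(\bar{X}/\sigma)t}\, M_X(t/\sigma)$

Recall that $Y = \dfrac{X - \bar{X}}{\sigma_X}$ is a standard variate.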

(F) If $M_X(t) = M_Y(t)$ for all values of t, then X and Y have the same probability distribution.

The proof is too difficult to be given here. This result means that if two variables have the same m.g.f., then they have the same probability distribution. That is, the m.g.f. uniquely determines the probability distribution of the random variable.

(G) The moment function behaves very simply when independent variables are added. Suppose that X and Y are independent random variables. Let Z = X + Y. Let $M_X(t)$, $M_Y(t)$ and $M_Z(t)$ be the m.g.f.'s of the random variables X, Y and Z, respectively.

Then $M_Z(t) = M_{X+Y}(t) = E\{e^{(X+Y)t}\} = E\{e^{Xt}\, e^{Yt}\} = E(e^{Xt})\, E(e^{Yt}) = M_X(t)\, M_Y(t),$

the product of the moment functions for X and Y.

This result may be generalized as follows: If $X_1, \dots, X_n$ are independent random variables with m.g.f.'s $M_{X_i}$, i = 1, 2, ..., n, then $M_Z$, the m.g.f. of

$Z = X_1 + \dots + X_n,$

is given by

$M_Z(t) = M_{X_1}(t) \cdots M_{X_n}(t).$
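Property (G) can be checked directly for two simple discrete variables. In this sketch, a fair die and a fair coin (both illustrative choices, as are the names `mgf`, `die`, `coin` and `z_pmf`) are added; the distribution of the sum is built by enumeration using independence, and its m.g.f. is compared with the product of the two individual m.g.f.'s at one value of t.

```python
import math
from itertools import product

def mgf(pmf, t):
    """M(t) = sum over x of e^{t x} p(x) for a discrete distribution."""
    return sum(p * math.exp(t * x) for x, p in pmf.items())

die = {k: 1 / 6 for k in range(1, 7)}
coin = {0: 0.5, 1: 0.5}

# Distribution of Z = X + Y by direct enumeration, using independence.
z_pmf = {}
for (x, px), (y, py) in product(die.items(), coin.items()):
    z_pmf[x + y] = z_pmf.get(x + y, 0.0) + px * py

t = 0.3
lhs = mgf(z_pmf, t)
rhs = mgf(die, t) * mgf(coin, t)
print(lhs, rhs)  # the two values should agree up to rounding
```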


5.14. Illustrative Examples.

Ex. 1. The pdf (probability density function) for the random variable X uniformly distributed over the interval (a, b) is given by

$f(x) = \dfrac{1}{b-a}, \quad a < x < b$

Sol. Its m.g.f. is

$M_X(t) = \int_a^b \dfrac{e^{tx}}{b-a}\, dx = \dfrac{e^{bt} - e^{at}}{(b-a)t}.$

Ex. 2. The random variable X is binomially distributed with parameters n and p when

$P(X = k) = {}^{n}C_{k}\, p^k q^{n-k}, \quad k = 0, 1, 2, \dots, n;\ q = 1 - p$

Sol. Its m.g.f. is

$M_X(t) = \sum_{k=0}^{n} e^{tk}\, {}^{n}C_{k}\, p^k q^{n-k} = \sum_{k=0}^{n} {}^{n}C_{k}\, (pe^t)^k q^{n-k} = [pe^t + q]^n$

[Recall $\sum_{k=0}^{n} {}^{n}C_{k}\, p^k q^{n-k} = (p+q)^n$]

The mean and variance can be obtained by differentiating with respect to t.

$M'(t) = n(pe^t + q)^{n-1}\, pe^t$

$M''(t) = np\left[e^t (n-1)(pe^t + q)^{n-2}\, pe^t + (pe^t + q)^{n-1} e^t\right]$

Therefore $E(X) = \mu_1' = M'(0) = np$

$E(X^2) = \mu_2' = M''(0) = np[(n-1)p + 1]$. Hence

$V(X) = M''(0) - [M'(0)]^2 = n(n-1)p^2 + np - n^2p^2 = np(1-p) = npq.$
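The closed form of the binomial m.g.f. in Ex. 2 can be compared with the defining sum. This is a minimal sketch with illustrative parameter values; the function names are assumptions made for the example.

```python
import math

def binomial_mgf_sum(n, p, t):
    """M_X(t) computed term by term from the definition."""
    q = 1 - p
    return sum(
        math.comb(n, k) * p ** k * q ** (n - k) * math.exp(t * k)
        for k in range(n + 1)
    )

def binomial_mgf_closed(n, p, t):
    """M_X(t) = (p e^t + q)^n."""
    return (p * math.exp(t) + (1 - p)) ** n

n, p, t = 10, 0.3, 0.7  # illustrative values
print(binomial_mgf_sum(n, p, t), binomial_mgf_closed(n, p, t))
```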





Ex. 3. The random variable X has a Poisson distribution with parameter m when

$P(X = k) = \dfrac{e^{-m} m^k}{k!}, \quad k = 0, 1, 2, \dots, \infty$

Sol. Thus $M_X(t) = \sum_{k=0}^{\infty} e^{tk}\, \dfrac{e^{-m} m^k}{k!} = e^{-m} \sum_{k=0}^{\infty} \dfrac{(me^t)^k}{k!} = e^{-m}\, e^{me^t} = e^{m(e^t - 1)}$

[Recall $\sum_{r=0}^{\infty} \dfrac{x^r}{r!} = e^x$]

$M'(t) = e^{m(e^t-1)}\, me^t$

$M''(t) = m\left[e^{m(e^t-1)}\, e^t + e^t\, e^{m(e^t-1)}\, me^t\right]$

Therefore

$E(X) = M'(0) = m$

$E(X^2) = M''(0) = m(1 + m)$. Hence

$V(X) = M''(0) - [M'(0)]^2 = m + m^2 - m^2 = m.$

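Ex. 4. A continuous random variable X assuming all non-negative values is said to have an exponential distribution with parameter $\alpha > 0$ if its pdf is given by

$f(x) = \alpha e^{-\alpha x}, \quad x > 0$

$\qquad = 0$, elsewhere

Its mgf is

$M_X(t) = \int_0^{\infty} e^{tx}\, \alpha e^{-\alpha x}\, dx = \alpha \int_0^{\infty} e^{x(t-\alpha)}\, dx$

[This integral converges only if $t < \alpha$]

$= \dfrac{\alpha}{t - \alpha}\, e^{x(t-\alpha)} \Big|_0^{\infty} = \dfrac{\alpha}{\alpha - t}, \quad t < \alpha.$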
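The exponential m.g.f. can be checked by approximating the defining integral numerically. The sketch below uses a simple midpoint rule on a truncated range; the function name, the truncation point `upper` and the number of steps are illustrative assumptions, and the result is only an approximation to $\alpha/(\alpha - t)$.

```python
import math

def exponential_mgf_numeric(alpha, t, upper=40.0, steps=400_000):
    """Midpoint-rule approximation to the integral of e^{tx} * alpha * e^{-alpha x}
    over (0, upper). Requires t < alpha so that the integrand decays."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.exp(t * x) * alpha * math.exp(-alpha * x)
    return total * h

alpha, t = 2.0, 0.5  # illustrative values with t < alpha
approx = exponential_mgf_numeric(alpha, t)
exact = alpha / (alpha - t)
print(approx, exact)
```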


Note. Since the mgf is simply an expected value of exp(tX), the mgf of a function of a random variable can be obtained without first obtaining its probability distribution. For example, if X is normally distributed with mean 0 and standard deviation unity, then the m.g.f. of $Y = X^2$ is given by

$M_Y(t) = E(e^{tY}) = E(e^{tX^2}) = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{tx^2}\, e^{-x^2/2}\, dx$

[For if X is N(0, 1), $f(x) = \dfrac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$]

$= \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}x^2(1 - 2t)}\, dx$

$= (1 - 2t)^{-1/2}$ [Recall $\int_{-\infty}^{\infty} e^{-ax^2}\, dx = \sqrt{\pi/a}$]

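Ex. 5. If X is normally distributed with mean $\mu$ and variance $\sigma^2$, it is symbolically denoted by $N(\mu, \sigma^2)$ and its pdf is given by

$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}, \quad -\infty < x < \infty.$

Its mgf is

$M_X(t) = \dfrac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{tx}\, e^{-(x-\mu)^2/2\sigma^2}\, dx$

Let $\dfrac{x - \mu}{\sigma} = s$; thus $x = \sigma s + \mu$ and $dx = \sigma\, ds$.

Therefore $M_X(t) = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{t(\sigma s + \mu) - s^2/2}\, ds$

$= e^{t\mu}\, \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(s^2 - 2\sigma t s)}\, ds$

$= e^{t\mu}\, \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}\{(s - \sigma t)^2 - \sigma^2 t^2\}}\, ds$

$= e^{t\mu + \frac{1}{2}\sigma^2 t^2}\, \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(s - \sigma t)^2}\, ds$

Let $s - \sigma t = v$; then $ds = dv$ and we obtain

$M_X(t) = e^{t\mu + \frac{1}{2}\sigma^2 t^2}$ [Recall $\dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-v^2/2}\, dv = 1$]

The coefficient of t in the expansion of $M_X(t)$ is $\mu$, and hence the mean of the given distribution is $\mu$. The variance is $\sigma^2$, since it is the coefficient of $t^2/2!$ in the expansion of $M_{X-\mu}(t) = e^{\frac{1}{2}\sigma^2 t^2}$.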


Since in the expansion of $M_{X-\mu}(t)$ only even powers of t occur, the central moments for the normal distribution $N(\mu, \sigma^2)$ are

$\mu_{2k+1} = 0,$

$\mu_{2k} = \dfrac{(2k)!\, \sigma^{2k}}{k!\, 2^k}$ [the coefficient of $\dfrac{t^{2k}}{(2k)!}$ is $\mu_{2k}$]

or

$\mu_{2k} = \dfrac{2k(2k-1)(2k-2)(2k-3) \cdots 3\cdot 2\cdot 1\ \sigma^{2k}}{k(k-1)(k-2)(k-3) \cdots 3\cdot 2\cdot 1 \cdot 2^k}$

$= \dfrac{[2k(2k-2)(2k-4) \cdots 4\cdot 2]\,[(2k-1)(2k-3) \cdots 3\cdot 1]\, \sigma^{2k}}{k!\, 2^k}$

$= 1\cdot 3\cdot 5 \cdots (2k-1)\, \sigma^{2k}.$

We know that if X has the normal distribution $N(\mu, \sigma^2)$ then $Y = \dfrac{X - \mu}{\sigma}$ has the standardised normal distribution. The moment function of Y is

$M_Y(t) = e^{-\mu t/\sigma}\, M_X(t/\sigma) = e^{-\mu t/\sigma}\, e^{\mu t/\sigma + \frac{1}{2}\sigma^2 (t/\sigma)^2} = e^{t^2/2}$

[Recall $M_X(t) = e^{t\mu + \frac{1}{2}\sigma^2 t^2}$]
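The two expressions for the even central moments of the normal distribution, $(2k)!\,\sigma^{2k}/(k!\,2^k)$ and $1\cdot 3\cdot 5 \cdots (2k-1)\,\sigma^{2k}$, can be compared with exact integer arithmetic. The sketch below takes $\sigma = 1$ for simplicity; the function names are illustrative assumptions.

```python
import math

def even_moment_factorial_form(k):
    """(2k)! / (k! * 2^k), the 2k-th central moment when sigma = 1."""
    return math.factorial(2 * k) // (math.factorial(k) * 2 ** k)

def even_moment_odd_product(k):
    """1 * 3 * 5 * ... * (2k - 1), the same moment written as a product."""
    return math.prod(range(1, 2 * k, 2))

for k in range(1, 7):
    print(k, even_moment_factorial_form(k), even_moment_odd_product(k))
```

For k = 2 both forms give 3, which is the familiar fourth central moment $3\sigma^4$ when $\sigma = 1$.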


Note : We shall have examples of one distribution approxi¬ 
mating another, and we shall find the following theorem useful. 

If as $n \to \infty$ the moment function of a random variable $X_n$ approaches the moment function of a random variable Y, the distribution function of $X_n$ approaches the distribution function of Y.

Ex. 6. Let X have a distribution $N(\mu, \sigma^2)$. We find the distribution of $Y = aX + b$.

The m.g.f. of Y is $M_Y(t) = e^{bt}\, M_X(at)$

Now $M_Y(t) = e^{bt}\left[e^{a\mu t + (a\sigma)^2 t^2/2}\right] = e^{(a\mu + b)t + \sigma^2 a^2 t^2/2}$

This last expression is the m.g.f. of a normally distributed random variable with mean $a\mu + b$ and variance $a^2\sigma^2$. This leads to the following:

Theorem: Suppose that $X_1, \dots, X_n$ are independent such that each $X_i$ is $N(\mu_i, \sigma_i^2)$. Then any linear combination of the $X_i$ has a normal distribution.

Proof. Let $Y = a_1X_1 + \dots + a_nX_n$. Then

$M_Y(t) = E\{e^{a_1X_1t} \cdots e^{a_nX_nt}\}$

$= E\{e^{a_1X_1t}\} \cdots E\{e^{a_nX_nt}\}$

$= M_{X_1}(a_1t)\, M_{X_2}(a_2t) \cdots M_{X_n}(a_nt)$

$= \left(e^{a_1\mu_1t + a_1^2\sigma_1^2t^2/2}\right) \cdots \left(e^{a_n\mu_nt + a_n^2\sigma_n^2t^2/2}\right)$

$= e^{\mu t + \sigma^2t^2/2}$, where $\mu = \sum a_i\mu_i$ and $\sigma^2 = \sum a_i^2\sigma_i^2$. Hence Y is $N(\mu, \sigma^2)$, completing the proof.

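Ex. 7. The moment function for the bivariate normal is

$M(t_1, t_2) = \exp\left\{\dfrac{t_1^2\sigma_1^2 + 2\rho\sigma_1\sigma_2t_1t_2 + \sigma_2^2t_2^2}{2} + t_1\mu_1 + t_2\mu_2\right\}$

Find the means, variances, covariance, and correlation coefficient.

Solution. The bivariate normal distribution has the density function

$f(x_1, x_2) = \dfrac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{-\dfrac{1}{2(1-\rho^2)}\left[u_1^2 - 2\rho u_1u_2 + u_2^2\right]\right\}, \quad u_i = \dfrac{x_i - \mu_i}{\sigma_i},$

where $\mu_1, \mu_2, \sigma_1, \sigma_2, \rho$ are parameters; $\sigma_1, \sigma_2 > 0$, $-1 < \rho < +1$. The parameters $\mu_1, \mu_2$ are called the means and $\sigma_1, \sigma_2$ the standard deviations for $X_1, X_2$ respectively; $\rho$ is called the correlation coefficient.

The joint moment-generating function is given by

$M(t_1, t_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{t_1x_1 + t_2x_2}\, f(x_1, x_2)\, dx_1\, dx_2$

To evaluate the integral, let us note that since

$u_1^2 - 2\rho u_1u_2 + u_2^2 = (1 - \rho^2)u_1^2 + (u_2 - \rho u_1)^2$

we may write

$M(t_1, t_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{t_1x_1 + t_2x_2}\, \dfrac{1}{\sigma_1}\phi(u_1)\, \dfrac{1}{\sigma_2\sqrt{1-\rho^2}}\, \phi\!\left(\dfrac{u_2 - \rho u_1}{\sqrt{1-\rho^2}}\right) dx_1\, dx_2 \qquad ...(1)$

in which $\phi(u) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}u^2}$ is the normal density function.

Performing the integration with respect to the variable $x_2$ in the integral in (1), we find

$M(t_1, t_2) = \int_{-\infty}^{\infty} \dfrac{1}{\sigma_1}\phi(u_1)\, e^{t_1x_1} \exp\left[\tfrac{1}{2}t_2^2\sigma_2^2(1-\rho^2) + t_2\mu_2 + t_2\rho\dfrac{\sigma_2}{\sigma_1}(x_1 - \mu_1)\right] dx_1$

$= \exp\left[\tfrac{1}{2}t_2^2\sigma_2^2(1-\rho^2) + t_2\mu_2 - t_2\dfrac{\sigma_2}{\sigma_1}\rho\mu_1\right] \times \exp\left[\mu_1\left(t_1 + t_2\dfrac{\sigma_2}{\sigma_1}\rho\right) + \tfrac{1}{2}\sigma_1^2\left(t_1 + t_2\dfrac{\sigma_2}{\sigma_1}\rho\right)^2\right]$

$= \exp\left[\mu_1t_1 + \mu_2t_2 + \tfrac{1}{2}\left(t_1^2\sigma_1^2 + 2\rho\sigma_1\sigma_2t_1t_2 + t_2^2\sigma_2^2\right)\right]$

The covariance is then

$\mathrm{Cov}(X_1, X_2) = \dfrac{\partial^2}{\partial t_1\, \partial t_2}\left[e^{-(\mu_1t_1 + \mu_2t_2)}\, M(t_1, t_2)\right]_{t_1=0,\ t_2=0} = \rho\sigma_1\sigma_2;$

$E(X_1) = \dfrac{\partial}{\partial t_1} M(t_1, t_2)\Big|_{t_1=0,\ t_2=0} = \mu_1$

$E(X_2) = \dfrac{\partial}{\partial t_2} M(t_1, t_2)\Big|_{t_1=0,\ t_2=0} = \mu_2$

$V(X_1) = \dfrac{\partial^2}{\partial t_1^2}\left[\exp\{-(t_1\mu_1 + t_2\mu_2)\}\, M(t_1, t_2)\right]_{t_1=0,\ t_2=0} = \sigma_1^2$

$V(X_2) = \dfrac{\partial^2}{\partial t_2^2}\left[\exp\{-(t_1\mu_1 + t_2\mu_2)\}\, M(t_1, t_2)\right]_{t_1=0,\ t_2=0} = \sigma_2^2$

Then $r = \rho\sigma_1\sigma_2/\sigma_1\sigma_2 = \rho$.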

5.15. The use of the moment generating function in obtaining various reproductive laws.

Reproductive Properties: If two (or more) independent random variables having a certain distribution are added, the resulting random variable has a distribution of the same type as that of the summands. This property is called a reproductive property.

Ex. 1. Reproductive property of the normal distribution.

Assume X and Y are independent random variables with normal distributions $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, respectively. Let Z = X + Y. Hence

$M_Z(t) = E(e^{Zt}) = E(e^{(X+Y)t}) = E(e^{tX})\, E(e^{tY}) = M_X(t)\, M_Y(t)$

or $M_Z(t) = \exp(\mu_1t + \tfrac{1}{2}\sigma_1^2t^2)\, \exp(\mu_2t + \tfrac{1}{2}\sigma_2^2t^2)$

$= \exp\{(\mu_1 + \mu_2)t + \tfrac{1}{2}(\sigma_1^2 + \sigma_2^2)t^2\}$

which is the mgf of a normally distributed random variable with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$. Thus Z is $N(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2)$.

The above result can be extended to the sum of n independent normal variables, that is, if $X_1, X_2, \dots, X_n$ are n independent random variables with distribution $N(\mu_i, \sigma_i^2)$, i = 1, 2, ..., n, then

$Z = X_1 + \dots + X_n$ has the distribution $N\left(\sum_i \mu_i,\ \sum_i \sigma_i^2\right)$.


Ex. 2. (Reproductive property of the Poisson Distribution)

Assume X and Y are independent random variables having Poisson distributions with parameters $m_1$ and $m_2$ respectively.

Let Z = X + Y. Then

$M_Z(t) = M_X(t)\, M_Y(t) = e^{m_1(e^t - 1)}\, e^{m_2(e^t - 1)} = e^{(m_1 + m_2)(e^t - 1)}$

which is the m.g.f. of a random variable with Poisson distribution having parameter $m_1 + m_2$. Generalising, we have that if $X_i$ has a Poisson distribution with parameter $m_i$, i = 1, 2, ..., n, then $Z = X_1 + \dots + X_n$ has a Poisson distribution with parameter

$m = m_1 + \dots + m_n.$
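The reproductive property of the Poisson distribution can also be seen without generating functions, by convolving the two probability functions directly. The sketch below does this for illustrative parameter values; the function names `poisson_pmf` and `convolved_pmf` are assumptions made for the example.

```python
import math

def poisson_pmf(m, k):
    """P(X = k) = e^{-m} m^k / k! for a Poisson variable with parameter m."""
    return math.exp(-m) * m ** k / math.factorial(k)

def convolved_pmf(m1, m2, z):
    """P(X + Y = z) for independent Poisson X and Y, summed over the
    ways of splitting z between the two variables."""
    return sum(poisson_pmf(m1, x) * poisson_pmf(m2, z - x) for x in range(z + 1))

m1, m2 = 1.5, 2.5  # illustrative parameters
for z in range(6):
    print(z, convolved_pmf(m1, m2, z), poisson_pmf(m1 + m2, z))
```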

Ex. 3. Reproductive property of the chi-square distribution.

Let $X_1, \dots, X_k$ be independent random variables, $X_i$ having the distribution $\chi^2_{n_i}$. Let $Z = X_1 + \dots + X_k$. Then Z has distribution $\chi^2_n$, where $n = n_1 + \dots + n_k$.

For $M_{X_i}(t) = (1 - 2t)^{-n_i/2}$, i = 1, 2, ..., k.

Hence

$M_Z(t) = M_{X_1}(t) \cdots M_{X_k}(t) = (1 - 2t)^{-(n_1 + \dots + n_k)/2} = (1 - 2t)^{-n/2}$

which is the m.g.f. of a random variable having the $\chi^2_n$ distribution.

Note. If $X_1, \dots, X_k$ are independent random variables each having distribution N(0, 1), then $S = X_1^2 + \dots + X_k^2$ has distribution $\chi^2_k$.

5.16. Factorial Moment Generating Function. The factorial moment function is defined by

$W(t) = E[(1 + t)^x]$

Now $(1 + t)^x = \sum_{r=0}^{\infty} \dfrac{x^{(r)}\, t^r}{r!}$, where $x^{(r)} = x(x-1)(x-2) \cdots (x - r + 1)$

The rth factorial moment about the origin, i.e. $\mu'_{(r)}$, is the coefficient of $t^r/r!$ in the expansion of W(t) in ascending powers of t.

5.17. Cumulant Generating Function.

The cumulant generating function is the natural logarithm of the moment function

$C_X(t) = \log M_X(t) = k_1t + k_2\dfrac{t^2}{2!} + \dots + k_r\dfrac{t^r}{r!} + \dots \qquad (1)$

The constants $k_1, k_2, \dots$ are called cumulants and behave somewhat as moments do (in some ways, better). On equating coefficients in the expansions of

$\exp\{C_X(t)\} = M_X(t)$

i.e.

$\exp\left(k_1t + k_2\dfrac{t^2}{2!} + \dots + k_r\dfrac{t^r}{r!} + \dots\right) = 1 + \mu_1't + \mu_2'\dfrac{t^2}{2!} + \dots + \mu_r'\dfrac{t^r}{r!} + \dots$

or

$1 + \{k_1t + k_2(t^2/2!) + \dots + k_r(t^r/r!) + \dots\} + (1/2!)\{k_1t + k_2(t^2/2!) + \dots + k_r(t^r/r!) + \dots\}^2 + (1/3!)\{k_1t + k_2(t^2/2!) + \dots\}^3 + \dots$

$= 1 + \mu_1't + \mu_2'(t^2/2!) + \dots + \mu_r'(t^r/r!) + \dots$

We find: $k_1 = \mu_1'$

$k_2 + k_1^2 = \mu_2'$ or $k_2 = \mu_2' - (\mu_1')^2 = \mu_2$

$\dfrac{k_3}{3!} + \dfrac{2k_1k_2}{2!\,2!} + \dfrac{k_1^3}{3!} = \dfrac{\mu_3'}{3!}$

or $\dfrac{k_3}{6} + \dfrac{k_1k_2}{2} + \dfrac{k_1^3}{6} = \dfrac{\mu_3'}{6}$

or $k_3 = \mu_3' - 3k_1k_2 - k_1^3$

$= \mu_3' - 3\mu_1'(\mu_2' - \mu_1'^2) - \mu_1'^3$

$= \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3 = \mu_3$

From (1) it is easy to observe that differentiating r times and then letting t = 0 gives

$k_r = \dfrac{d^r}{dt^r}\, C_X(t)\Big|_{t=0}$
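The relations between the first three cumulants and the raw moments can be tried out on a concrete distribution. The sketch below is illustrative only: the function names, the Poisson parameter and the truncation of the series at 100 terms are assumptions. It computes the raw moments of a Poisson variable by summing its probability function and converts them to cumulants; for a Poisson distribution the first three cumulants should all come out close to the parameter m.

```python
import math

def cumulants_from_raw_moments(m1, m2, m3):
    """First three cumulants from the raw moments mu'_1, mu'_2, mu'_3."""
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    return k1, k2, k3

def poisson_raw_moment(m, r, terms=100):
    """mu'_r = sum over k of k^r e^{-m} m^k / k!, truncated after `terms` terms."""
    return sum(
        k ** r * math.exp(-m) * m ** k / math.factorial(k) for k in range(terms)
    )

m = 2.0  # illustrative Poisson parameter
raw = [poisson_raw_moment(m, r) for r in (1, 2, 3)]
k1, k2, k3 = cumulants_from_raw_moments(*raw)
print(k1, k2, k3)
```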

Additive property of Cumulants. Let $X_1$ and $X_2$ be independent.

Then $C_{X_1+X_2}(t) = \log M_{X_1+X_2}(t) = \log E\{e^{tX_1}\, e^{tX_2}\}$

$= \log\{E(e^{tX_1})\, E(e^{tX_2})\}$

$= \log M_{X_1}(t) + \log M_{X_2}(t) = C_{X_1}(t) + C_{X_2}(t) \qquad ...(2)$

Therefore

$k_1(X_1 + X_2)\,t + k_2(X_1 + X_2)\dfrac{t^2}{2!} + \dots + k_r(X_1 + X_2)\dfrac{t^r}{r!} + \dots$

$= k_1(X_1)\,t + k_2(X_1)\dfrac{t^2}{2!} + \dots + k_r(X_1)\dfrac{t^r}{r!} + \dots + k_1(X_2)\,t + k_2(X_2)\dfrac{t^2}{2!} + \dots + k_r(X_2)\dfrac{t^r}{r!} + \dots$

Equating the coefficients of $\dfrac{t^r}{r!}$ furnishes:

$k_r(X_1 + X_2) = k_r(X_1) + k_r(X_2) \qquad ...(3)$

Thus the rth cumulant of the sum of independent stochastic variates is equal to the sum of the rth cumulants of the independent variates.


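The m.g.f. with respect to the mean is

$M_{X-\mu_1'}(t) = e^{-\mu_1't}\, M_X(t)$

Hence on taking logarithm

$C_{X-\mu_1'}(t) = -\mu_1't + C_X(t)$

$= -\mu_1't + k_1t + k_2\dfrac{t^2}{2!} + k_3\dfrac{t^3}{3!} + \dots + k_r\dfrac{t^r}{r!} + \dots$

$= k_2\dfrac{t^2}{2!} + k_3\dfrac{t^3}{3!} + \dots$ [since $k_1 = \mu_1'$]

This must be identical with

$\log M_{X-\mu_1'}(t) = \log E\left[e^{(X-\mu_1')t}\right]$

$= \log E\left\{1 + (X - \mu_1')t + (X - \mu_1')^2\dfrac{t^2}{2!} + \dots\right\}$

$= \log\left(1 + \mu_2\dfrac{t^2}{2!} + \mu_3\dfrac{t^3}{3!} + \dots\right)$

$= \left(\mu_2\dfrac{t^2}{2!} + \mu_3\dfrac{t^3}{3!} + \dots + \mu_r\dfrac{t^r}{r!} + \dots\right) - \dfrac{1}{2}\left(\mu_2\dfrac{t^2}{2!} + \mu_3\dfrac{t^3}{3!} + \dots\right)^2 + \dots$

Comparing coefficients of like powers of t leads to:

$k_2 = \mu_2, \quad k_3 = \mu_3, \quad k_4 = \mu_4 - 3\mu_2^2,$

and then

$\dfrac{k_4}{k_2^2} = \dfrac{\mu_4}{\mu_2^2} - 3 = \beta_2 - 3,$ which is the excess of kurtosis.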



5.15-1. Effect of change of origin and scale on cumulants

Let $Y = \dfrac{X - a}{h}$

and hence $M_Y(t) = M_{(X-a)/h}(t) = E\left\{e^{\left(\frac{X-a}{h}\right)t}\right\}$

$= e^{-(a/h)t}\, E\left(e^{(X/h)t}\right) = e^{-(a/h)t}\, M_X(t/h)$

Therefore on taking logarithm

$C_Y(t) = -(a/h)\,t + C_X(t/h)$

or $k_1't + k_2'\dfrac{t^2}{2!} + \dots + k_r'\dfrac{t^r}{r!} + \dots = -\dfrac{a}{h}t + k_1\dfrac{t}{h} + k_2\dfrac{t^2}{2!\,h^2} + k_3\dfrac{t^3}{3!\,h^3} + \dots + k_r\dfrac{t^r}{r!\,h^r} + \dots$

where $k_r'$ and $k_r$ are the rth cumulants of Y and X respectively. Comparing the coefficients of like powers of t we have

$k_1' = \dfrac{k_1 - a}{h}, \qquad k_r' = \dfrac{k_r}{h^r} \quad (r \ge 2)$

Thus except the first cumulant, all cumulants are independent of the change of origin. But the cumulants are not invariant under the change of scale, as the rth cumulant of Y is $1/h^r$ times the rth cumulant of X.

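Ex. For the binomial distribution show that

$k_4 = npq(1 - 6pq).$

Sol. The m.g.f. for the binomial distribution with parameters n, p is

$M_X(t) = (q + pe^t)^n$

Thus on taking the logarithm

$C_X(t) = \log M_X(t) = n \log(q + pe^t)$

$k_1 = \dfrac{d}{dt} C_X(t)\Big|_{t=0} = \dfrac{npe^t}{q + pe^t}\Big|_{t=0} = np$

$k_2 = \dfrac{d^2}{dt^2} C_X(t)\Big|_{t=0} = \dfrac{d}{dt}\left[n\left(1 - \dfrac{q}{q + pe^t}\right)\right]_{t=0} = \left[\dfrac{nqpe^t}{(q + pe^t)^2}\right]_{t=0} = npq$

$k_3 = \dfrac{d^3}{dt^3} C_X(t)\Big|_{t=0} = \dfrac{d}{dt}\left[nq\left(\dfrac{1}{q + pe^t} - \dfrac{q}{(q + pe^t)^2}\right)\right]_{t=0}$

$= \left[\dfrac{-nqpe^t}{(q + pe^t)^2} + \dfrac{2nq^2pe^t}{(q + pe^t)^3}\right]_{t=0}$

$= 2nq^2p - nqp = npq(2q - 1) = npq\,(q - \overline{1 - q}) = npq(q - p).$

$k_4 = \dfrac{d^4}{dt^4} C_X(t)\Big|_{t=0} = \dfrac{d}{dt}\left[\dfrac{-nqpe^t}{(q + pe^t)^2} + \dfrac{2nq^2pe^t}{(q + pe^t)^3}\right]_{t=0}$

$= \dfrac{d}{dt}\left[-nq\left(\dfrac{1}{q + pe^t} - \dfrac{q}{(q + pe^t)^2}\right) + 2nq^2\left(\dfrac{1}{(q + pe^t)^2} - \dfrac{q}{(q + pe^t)^3}\right)\right]_{t=0}$

$= \dfrac{d}{dt}\left[\dfrac{-nq}{q + pe^t} + \dfrac{3nq^2}{(q + pe^t)^2} - \dfrac{2nq^3}{(q + pe^t)^3}\right]_{t=0}$

$= \left[\dfrac{nqpe^t}{(q + pe^t)^2} - \dfrac{6nq^2pe^t}{(q + pe^t)^3} + \dfrac{6nq^3pe^t}{(q + pe^t)^4}\right]_{t=0}$

$= nqp - 6nq^2p + 6nq^3p$

$= nqp - 6nq^2p\,(1 - q)$

$= nqp\,(1 - 6pq).$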
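The values of $k_3$ and $k_4$ obtained above can be cross-checked through the central moments, using $k_3 = \mu_3$ and $k_4 = \mu_4 - 3\mu_2^2$. The sketch below computes the central moments of a binomial distribution exactly with fractions; the function name and the values n = 7, p = 1/3 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def binomial_central_moment(n, p, r):
    """Exact r-th central moment of a binomial variable, by direct summation."""
    q = 1 - p
    mean = n * p
    return sum(
        comb(n, x) * p ** x * q ** (n - x) * (x - mean) ** r for x in range(n + 1)
    )

n, p = 7, Fraction(1, 3)  # illustrative values
q = 1 - p
mu2 = binomial_central_moment(n, p, 2)
mu3 = binomial_central_moment(n, p, 3)
mu4 = binomial_central_moment(n, p, 4)

k3 = mu3
k4 = mu4 - 3 * mu2 ** 2
print(k3 == n * p * q * (q - p))
print(k4 == n * p * q * (1 - 6 * p * q))
```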


Exercises 


1. Let X be the outcome when a fair die is tossed.

(a) Find the m.g.f. of X

(b) and hence find E(X) and V(X).

Ans. (a) $M_X(t) = \dfrac{1}{6}\,(e^t + e^{2t} + e^{3t} + e^{4t} + e^{5t} + e^{6t})$

2. If X has distribution $\chi^2_n$, using the m.g.f., show that

E(X) = n and V(X) = 2n.

3. Suppose that the continuous random variable X has pdf

$f(x) = \dfrac{1}{2}\, e^{-|x|}, \quad -\infty < x < \infty$

Obtain the m.g.f. of X and hence find E(X) and V(X).

[Hint. $M_X(t) = \dfrac{1}{2}\int_{-\infty}^{\infty} e^{tx}\, e^{-|x|}\, dx = \dfrac{1}{2}\int_{-\infty}^{0} e^{(1+t)x}\, dx + \dfrac{1}{2}\int_{0}^{\infty} e^{-(1-t)x}\, dx$

since $|x| = -x$ when $x < 0$ and $|x| = x$ when $x \ge 0$.]


4. The random variable X of a two-parameter exponential distribution has the following pdf:

$f(x) = me^{-m(x-a)}, \quad x \ge a.$

Find the m.g.f. of X and use it to find E(X) and V(X).

[Ans. $E(X) = a + \dfrac{1}{m}$, $V(X) = \dfrac{1}{m^2}$]

5. Suppose that the m.g.f. of a random variable X is of the form

$M_X(t) = (0.4e^t + 0.6)^8$

Find the m.g.f. of the random variable Y = 3X + 2 and also evaluate E(X). [Ans. E(X) = 3.2]

6. If the random variable X has a mgf given by

$M_X(t) = \dfrac{3}{3 - t},$

obtain the standard deviation of X. [Ans. 1/3]

7. Find the m.g.f. of a random variable which is uniformly distributed over (−1, 2).

[Hint. $M_X(t) = \dfrac{1}{3}\int_{-1}^{2} e^{tx}\, dx = \dfrac{e^{3t} - 1}{3te^t}$]




8. Show that for the rectangular distribution

$f(x) = 1/2a, \quad -a < x < a$

the m.g.f. about the origin is $\dfrac{1}{at}\sinh at$. Also show that the moments of even order are given by

$\mu_{2n} = \dfrac{a^{2n}}{2n + 1}$ [Kanpur 69]

9. (a) Define cumulants and obtain the first four cumulants in terms of central moments.

(b) If X is a variable with zero mean and cumulants $k_r$, show that the first two cumulants $l_1$ and $l_2$ of $X^2$ are given by $l_1 = k_2$ and $l_2 = 2k_2^2 + k_4$. [Bombay 68]

10. If X is a random variate with cumulants $k_r$ (r = 1, 2, ...), find the cumulants of

(i) aX (ii) a + X, where a is a constant.

[Bombay B.Sc. 67]


11. For the binomial distribution, show that

$k_{r+1} = pq\, \dfrac{dk_r}{dp}, \quad r \ge 1$

[Delhi M.A. 59]

12. Express the moment generating function about a point p in terms of the moment generating function about another point k.

Hint. A) + (/ '“* )]/ j 

= e (A - fc) w<' ) 


13. (a) If $\mu_r'$ is the rth raw moment, $\mu_r$ the rth central moment and $k_r$ the rth cumulant of a certain distribution, show how to express $\mu_r$ and $k_r$ in terms of $\mu_i'$ (i = 1, 2, ..., r). Obtain an expression for $k_4$ in terms of $\mu_i$ (i = 1, 2, 3, 4).

(b) Show that the cumulants except the first are independent of origin.

(c) The m.g.f. of a random variable X is M(t). Show that the m.g.f. of $\bar{x}$ is $[M(t/n)]^n$, where $\bar{x}$ is the mean of a sample of size n.

14. Define the terms: cumulant generating function and characteristic function of a distribution, considering both the cases when the distribution is discrete and when it is continuous. State the connection between the two and point out their uses in statistics.

Show that the cumulants of a Poisson distribution are all equal and conversely.

The rth cumulant of a distribution is

$k_r = (r - 1)!, \quad r = 1, 2, 3, \dots$

Find the corresponding c.g.f. and c.f. What is this distribution known as?

15. Let /V denote the rth moment about the origin. Then 

(j._l J 


Special Discrete Probability 

Distributions 


6.1. Introduction.

A sample space is called a discrete sample space if it contains a finite or countable number of points. Recall that a set is countable if its elements (points) can be put into one-one correspondence with the positive integers. A probability distribution is called a discrete probability distribution if it refers to a discrete sample space. Discrete means 'separate' or 'individually distinct' and distinguishes such a sample space from the continuum where the values run together.

6.2. Binomial Distribution. An important distribution function of a discrete variate is the binomial distribution, which may be derived in the following manner. Suppose an experiment is performed and A is some event associated with it (which we may call 'success'). Suppose that $P(A) = p$ and hence $P(\bar{A}) = 1 - p$. Let us consider n independent repetitions of the experiment. The sample space will consist of all possible sequences $\{a_1, a_2, \dots, a_n\}$, where each $a_i$ is either A or $\bar{A}$ (not A), depending on whether A or $\bar{A}$ occurred on the ith repetition of the experiment. (There are $2^n$ such sequences.) Furthermore, we assume that $P(A) = p$ remains constant for all repetitions. Let us choose the random variable X such that it represents the number of times the event A occurred in n repetitions. The random variable X is called a binomial variable with parameters n and p. The possible values of X are 0, 1, 2, ..., n.

Theorem 1. Let X have a binomial distribution with parameters n and p. Then

$P(X = r) = {}^{n}C_{r}\, p^r q^{n-r}, \quad r = 0, 1, 2, \dots, n;\ p + q = 1$

Proof. The probability that the first r repetitions produce A and the remaining n − r produce $\bar{A}$ is

$p \cdot p \cdots p \cdot q \cdot q \cdots q = p^r q^{n-r},$

where there are r factors p and n − r factors q. Similarly the probability for any fixed sequence of r A's and n − r $\bar{A}$'s is $p^r q^{n-r}$. The number of such sequences is ${}^{n}C_{r}$, for we must choose exactly r positions (out of n) for the A's. Since these ${}^{n}C_{r}$ outcomes are all mutually exclusive, the probability that there are exactly r events A is ${}^{n}C_{r}\, p^r q^{n-r}$.

We check to see that the sum of the values of the probability function is 1:

$\sum_{r=0}^{n} P(X = r) = \sum_{r=0}^{n} {}^{n}C_{r}\, p^r q^{n-r} = (p + q)^n = 1^n = 1.$

Since the probabilities ${}^{n}C_{r}\, p^r q^{n-r}$ are obtained by expanding the binomial expression $(p + q)^n$, we call this the binomial distribution.

Note 1. Sometimes the binomial probability law is denoted by

$b(r; n, p) = {}^{n}C_{r}\, p^r q^{n-r}, \quad r = 0, 1, 2, \dots, n.$

2. If one single experiment consists of n independent trials and there are N such experiments, then the number of experiments in which there will be exactly r events A (successes) is $N\, {}^{n}C_{r}\, p^r q^{n-r}$. For N sets of experiments, each of n independent trials, the expected frequencies of 0, 1, 2, ..., n events A (successes) are given by the successive terms in the binomial expansion of $N(q + p)^n$.

3. The probability of at least r successes in n trials is given by

$P(X \ge r) = \sum_{x=r}^{n} {}^{n}C_{x}\, p^x q^{n-x}$

or by $P(X \ge r) = 1 - \sum_{x=0}^{r-1} {}^{n}C_{x}\, p^x q^{n-x}.$

Ex. 1. A perfect cubical die is thrown a large number of times in sets of 8. The occurrence of 5 or 6 is called a success. In what proportion of the sets would you expect 3 successes? [Agra 48]

There are six faces in a cubical die, out of which the occurrence of 5 or 6 is called a success. Hence

$P(A) = p = \dfrac{2}{6} = \dfrac{1}{3}, \qquad P(\bar{A}) = 1 - p = 2/3.$

Here n = 8. The expected frequencies of successes are the terms in the expansion of

$N(p + q)^n = N\left(\tfrac{1}{3} + \tfrac{2}{3}\right)^8.$

The number of sets in which 3 successes are expected

$= N\left[{}^{8}C_{3}\left(\tfrac{1}{3}\right)^3\left(\tfrac{2}{3}\right)^5\right] = N\cdot\dfrac{56 \times 32}{243 \times 27}.$

Therefore the required percentage

$= \dfrac{56 \times 32}{243 \times 27} \times 100 = 27.31\%$ approx.


Ex. 2. If on the overage rain falls on ten days in every thirty 
find the probability (/) that rain will fall on at least three days 
of a given week (//) that first three days of a given week will be 
fine and the remaining wet. 

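Probability of rainfall: $p = \tfrac{10}{30} = \tfrac{1}{3}$, and therefore the probability of the opposite event is $q = 1 - p = \tfrac{2}{3}$.

(i) Here n = 7, r = 3. Hence

$$P(X \ge 3) = \sum_{x=3}^{7} {}^{7}C_{x}\left(\tfrac{1}{3}\right)^{x}\left(\tfrac{2}{3}\right)^{7-x} = 1 - \sum_{x=0}^{2} {}^{7}C_{x}\left(\tfrac{1}{3}\right)^{x}\left(\tfrac{2}{3}\right)^{7-x}$$

$$= 1 - \left[\left(\tfrac{2}{3}\right)^{7} + 7\left(\tfrac{1}{3}\right)\left(\tfrac{2}{3}\right)^{6} + 21\left(\tfrac{1}{3}\right)^{2}\left(\tfrac{2}{3}\right)^{5}\right]$$

$$= 1 - \frac{2^{5}}{3^{7}}\,[4 + 14 + 21] = 1 - \frac{416}{729} = \frac{313}{729}.$$

(ii) The probability that the first three days are fine and the remaining four wet

$$= \left(\tfrac{2}{3}\right)^{3}\left(\tfrac{1}{3}\right)^{4} = \frac{8}{2187}.$$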
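The probability law of Note 1 and the tail formula of Note 3 can be checked numerically. The following is a minimal Python sketch; the function names `binom_pmf` and `prob_at_least` are illustrative choices of ours and not notation from the text. It recomputes the answers of Ex. 1 and Ex. 2 above.

```python
from math import comb

def binom_pmf(r, n, p):
    """b(r; n, p) = nCr * p^r * q^(n-r), with q = 1 - p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def prob_at_least(r, n, p):
    """P(X >= r) = 1 - sum_{x=0}^{r-1} b(x; n, p), as in Note 3."""
    return 1 - sum(binom_pmf(x, n, p) for x in range(r))

# Ex. 1: proportion of sets of 8 throws with exactly 3 successes, p = 1/3.
ex1 = binom_pmf(3, 8, 1/3)        # the fraction 1792/6561
# Ex. 2 (i): rain on at least 3 days of a week, p = 1/3.
ex2 = prob_at_least(3, 7, 1/3)    # the fraction 313/729
print(round(100 * ex1, 1), round(ex2, 4))
```

Floating-point values are used here only for brevity; passing `fractions.Fraction` values for p would give the exact fractions.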


Ex. 3. If two symmetrical binomial distributions of degree n (the same number of observations) are so superposed that the rth term of the one coincides with the (r+1)th term of the other, then the distribution formed by adding superposed terms is a symmetrical binomial distribution of degree (n+1).





The binomial distribution is symmetrical when $p = q = \tfrac{1}{2}$. Let N be the total number of observations. Then the rth term of the first symmetrical binomial distribution of degree n is given by

$$B_{r-1} = N\,{}^{n}C_{r-1}\left(\tfrac{1}{2}\right)^{n} \qquad (1)$$

and the (r+1)th term of the other symmetrical binomial distribution of degree n is given by

$$B'_{r} = N\,{}^{n}C_{r}\left(\tfrac{1}{2}\right)^{n}. \qquad (2)$$

After superposition we have

$$B''_{r} = B_{r-1} + B'_{r} = N\left(\tfrac{1}{2}\right)^{n}\left[{}^{n}C_{r-1} + {}^{n}C_{r}\right] = N\left(\tfrac{1}{2}\right)^{n}\,{}^{n+1}C_{r} = 2N\left(\tfrac{1}{2}\right)^{n+1}\,{}^{n+1}C_{r},$$

which is the (r+1)th term of a symmetrical binomial distribution of degree n+1. Note that here 2N is the total number of observations.

Ex. 4. X is a random variable distributed according to the binomial law:

$$P\{X = x\} = b(x;\, n, p) = {}^{n}C_{x}\, p^{x} q^{n-x}, \qquad x = 0, 1, 2, \ldots, n.$$

Obtain the recurrence formula

$$b(x+1;\, n, p) = \frac{n-x}{x+1}\cdot\frac{p}{q}\; b(x;\, n, p).$$

[Delhi B.Sc. (Hons) 67]

Now

$$b(x+1;\, n, p) = {}^{n}C_{x+1}\, p^{x+1} q^{n-x-1} = \frac{n(n-1)\cdots(n-x+1)(n-x)}{(x+1)!}\, p^{x+1} q^{n-x-1}$$

$$= \frac{n-x}{x+1}\cdot\frac{p}{q}\cdot\frac{n(n-1)\cdots(n-x+1)}{x!}\, p^{x} q^{n-x}$$

$$= \frac{n-x}{x+1}\cdot\frac{p}{q}\; b(x;\, n, p).$$


Note. This recurrence formula may be conveniently denoted by

$$f(x+1) = \frac{n-x}{x+1}\cdot\frac{p}{q}\, f(x).$$

This is very useful in fitting a binomial distribution to a given set of data.
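The recurrence above lends itself to computation, since each probability is obtained from the previous one by a single multiplication. A minimal Python sketch follows; the function name `binom_pmf_by_recurrence` is an illustrative choice, and the sketch assumes 0 < p < 1 so that q is not zero.

```python
from math import comb

def binom_pmf_by_recurrence(n, p):
    """Return [f(0), ..., f(n)] using f(x+1) = (n-x)/(x+1) * (p/q) * f(x).

    Assumes 0 < p < 1 so that q = 1 - p is non-zero.
    """
    q = 1 - p
    probs = [q**n]                      # f(0) = q^n
    for x in range(n):
        probs.append(probs[-1] * (n - x) / (x + 1) * p / q)
    return probs

# Compare with the direct formula nCx p^x q^(n-x) for n = 8, p = 1/3.
probs = binom_pmf_by_recurrence(8, 1/3)
direct = [comb(8, x) * (1/3)**x * (2/3)**(8 - x) for x in range(9)]
```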

Ex. 5. If in a Bernoulli sequence of n trials it is known that there are exactly r successes, what is the conditional probability of a success at the ith trial?





Let the event of r successes in n trials be denoted by $A_{n,r}$, and let the event of r-1 successes in the remaining n-1 trials be denoted by $B_{n-1,\,r-1}$. Suppose $E_i$ denotes success at the ith trial.

We have to find

$$P(E_i \mid A_{n,r}) = \frac{P(E_i \cap B_{n-1,\,r-1})}{P(A_{n,r})} = \frac{P(E_i)\,P(B_{n-1,\,r-1})}{P(A_{n,r})}$$

$$= \frac{p \cdot {}^{n-1}C_{r-1}\, p^{r-1} q^{n-r}}{{}^{n}C_{r}\, p^{r} q^{n-r}} = \frac{{}^{n-1}C_{r-1}}{{}^{n}C_{r}} = \frac{r}{n},$$

which is independent of p and is the same for each i.

6.3. Mode of the Binomial distribution: To find the most probable number of successes in a series of n independent trials, the probability of success in each trial being p.

Equivalently, let p be a real number with 0 < p < 1. Then the probability $P(n, r) = {}^{n}C_{r}\, p^{r} q^{n-r}$, for each positive integer n, is greatest for that integer value $r_m$ of r which fulfils the inequalities

$$p(n+1) - 1 \le r_m \le p(n+1);$$

if p(n+1) is an integer and $r_m = p(n+1)$, then $P(n, r_m - 1) = P(n, r_m)$ and both are greater than P(n, r) for any other value of r.


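Proof. Let $B_r = {}^{n}C_{r}\, p^{r} q^{n-r}$. The mode is that value of r for which $B_r$ is greatest.

$B_r$ is greatest $\Rightarrow$ $B_{r-1} \le B_r \ge B_{r+1}$.

Now $B_{r-1} \le B_r$ $\Rightarrow$

$$\frac{n(n-1)\cdots(n-r+2)}{(r-1)!}\, p^{r-1} q^{n-r+1} \le \frac{n(n-1)\cdots(n-r+2)(n-r+1)}{r!}\, p^{r} q^{n-r}$$

or

$$q \le \frac{n-r+1}{r}\, p, \quad \text{i.e.} \quad qr \le p(n-r+1).$$

Then $r \le p(n+1)$, since p + q = 1.

Again $B_r \ge B_{r+1}$ $\Rightarrow$

$${}^{n}C_{r}\, p^{r} q^{n-r} \ge {}^{n}C_{r+1}\, p^{r+1} q^{n-r-1} \;\Rightarrow\; q \ge \frac{n-r}{r+1}\, p$$

$$\Rightarrow\; q(r+1) \ge (n-r)p \;\Rightarrow\; r \ge np - q \;\Rightarrow\; r \ge p(n+1) - 1, \text{ since } q = 1 - p.$$

Then $p(n+1) - 1 \le r_m \le p(n+1)$.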




If p(n+1) is not an integer, then $r_m$ is the greatest integer contained in p(n+1).

If p(n+1) is an integer, $r_m$ can be both p(n+1) - 1 and p(n+1).

If n is increased, the mode tends to np, since $p - q/n \le r_m/n \le p + p/n$.
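The rule just stated can be written as a short routine. The sketch below is only an illustration under the assumption 0 < p < 1; the name `binom_modes` is ours, and exact rational arithmetic is used so that the test "p(n+1) is an integer" is not disturbed by rounding.

```python
from fractions import Fraction
from math import comb, floor

def binom_modes(n, p):
    """Most probable number(s) of successes for 0 < p < 1.

    p is converted with Fraction(p); pass a Fraction (or a string such as
    "3/5") when p(n+1) may be an integer, so that the test below is exact.
    """
    k = Fraction(p) * (n + 1)
    if k.denominator == 1:              # p(n+1) is an integer: two modes
        return [int(k) - 1, int(k)]
    return [floor(k)]                   # otherwise the integral part of p(n+1)

# n = 99 tossings with p = 3/5: p(n+1) = 60 is an integer.
modes = binom_modes(99, Fraction(3, 5))
```

When p(n+1) is an integer the routine returns both equally probable values, in agreement with the equality $P(n, r_m - 1) = P(n, r_m)$ stated in the theorem.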

Ex. 1. Find the most probable number of heads in 99 tossings of a biased coin, given that the probability of a head in a single tossing is 3/5. [Punjab M.A. 59, Delhi M.A. 61]

Here n = 99, p = 3/5. Then the inequality

$$p(n+1) - 1 \le r_m \le p(n+1)$$

gives $\tfrac{3}{5}(99+1) - 1 \le r_m \le \tfrac{3}{5}(99+1)$, i.e. $59 \le r_m \le 60$. Hence $r_m = 60$; since p(n+1) = 60 is an integer, r = 59 is equally probable.

Ex. 2. Let np be a whole number. Then the mean of the 
binomial coincides with the greatest term. 

Ex. 3. For a binomial variable x, prove that 

where P(x) is the probability function of x. Deduce that P(x) increases with x so long as x < np - q.

6.4. Four moments and $\beta_1$, $\beta_2$, $\gamma_1$ and $\gamma_2$ of the binomial distribution.

The probability of r successes in a series of n independent trials is $P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}$.

Hence the mth moment about the origin is

$$\mu'_m = \sum_{r=0}^{n} r^{m}\, {}^{n}C_{r}\, p^{r} q^{n-r},$$

since $\mu'_m = \frac{1}{N}\sum_i f_i x_i^{m}$ with $\sum_i f_i = N$, so that $\mu'_m = \sum_i (f_i/N)\, x_i^{m} = \sum_i p_i x_i^{m}$.

Then by Ex. 2 (p. 182),

$$\mu'_1 = np, \qquad \mu'_2 = n(n-1)p^{2} + np, \qquad \mu_2 = npq = \sigma^{2}.$$

Take the arbitrary origin at 0 successes.

Now

$$\mu'_3 = \sum_{r=0}^{n} r^{3}\, {}^{n}C_{r}\, p^{r} q^{n-r} = np \sum_{r=1}^{n} r^{2}\, {}^{n-1}C_{r-1}\, p^{r-1} q^{n-r}$$

$$= np \sum_{r=1}^{n} [(r-1) + 1]^{2}\, {}^{n-1}C_{r-1}\, p^{r-1} q^{n-r}$$

$$= np \sum_{r=1}^{n} [(r-1)^{2} + 2(r-1) + 1]\, {}^{n-1}C_{r-1}\, p^{r-1} q^{n-r}.$$

But

$$\sum_{r=1}^{n} (r-1)^{2}\, {}^{n-1}C_{r-1}\, p^{r-1} q^{n-r} = \mu'_2 \text{ of the distribution } (q+p)^{n-1} = (n-1)(n-2)p^{2} + (n-1)p,$$

$$\sum_{r=1}^{n} (r-1)\, {}^{n-1}C_{r-1}\, p^{r-1} q^{n-r} = \mu'_1 \text{ of the distribution } (q+p)^{n-1} = (n-1)p,$$

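and

$$\sum_{r=1}^{n} {}^{n-1}C_{r-1}\, p^{r-1} q^{n-r} = (q+p)^{n-1} = 1.$$

Therefore

$$\mu'_3 = np\,[(n-1)(n-2)p^{2} + 3(n-1)p + 1] = n(n-1)(n-2)p^{3} + 3n(n-1)p^{2} + np.$$

Now $\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\mu_1'^{3}$ gives

$$\mu_3 = np(2p^{2} - 3p + 1) = np(1-p)(1-2p) = npq(q-p).$$

Also

$$\mu'_4 = \sum_{r=0}^{n} r^{4}\, {}^{n}C_{r}\, p^{r} q^{n-r} = np \sum_{r=1}^{n} r^{3}\, {}^{n-1}C_{r-1}\, p^{r-1} q^{n-r}$$

$$= np \sum_{r=1}^{n} [(r-1) + 1]^{3}\, {}^{n-1}C_{r-1}\, p^{r-1} q^{n-r}$$

$$= np \sum_{r=1}^{n} [(r-1)^{3} + 3(r-1)^{2} + 3(r-1) + 1]\, {}^{n-1}C_{r-1}\, p^{r-1} q^{n-r}.$$

The above expression contains the terms $\mu'_3$, $3\mu'_2$, $3\mu'_1$ of the distribution $(q+p)^{n-1}$ and 1.

Hence

$$\mu'_4 = np\,[(n-1)(n-2)(n-3)p^{3} + 3(n-1)(n-2)p^{2} + (n-1)p + 3(n-1)(n-2)p^{2} + 3(n-1)p + 3(n-1)p + 1]$$

$$= n(n-1)(n-2)(n-3)p^{4} + 6n(n-1)(n-2)p^{3} + 7n(n-1)p^{2} + np.$$

Now $\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2\mu_1'^{2} - 3\mu_1'^{4}$ gives

$$\mu_4 = 3n^{2}p^{2}q^{2} + npq(1 - 6pq)$$

after slight simplification.

Now

$$\beta_1 = \frac{\mu_3^{2}}{\mu_2^{3}} = \frac{(q-p)^{2}}{npq} = \frac{(1-2p)^{2}}{npq}, \qquad \gamma_1 = \sqrt{\beta_1} = \frac{q-p}{\sqrt{npq}} = \frac{1-2p}{\sqrt{npq}}.$$

Thus skewness is positive if $p < \tfrac{1}{2}$, negative if $p > \tfrac{1}{2}$ and zero if $p = \tfrac{1}{2}$.

Also

$$\beta_2 = \frac{\mu_4}{\mu_2^{2}} = 3 + \frac{1 - 6pq}{npq}, \qquad \gamma_2 = \beta_2 - 3 = \frac{1 - 6pq}{npq}.$$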

We observe that as n, the number of trials, increases indefinitely, $\beta_1 \to 0$, $\beta_2 \to 3$, and $\gamma_1$ and $\gamma_2$ both tend to zero in the limit. We may note here that for the normal distribution $\beta_1 = 0$, $\beta_2 = 3$, and so we can infer that as n, the number of trials, increases indefinitely, the binomial distribution tends to the normal distribution.

Remark. The moments of the binomial distribution can be most conveniently obtained by finding the factorial moments. Here $x^{(s)} = x(x-1)\cdots(x-s+1)$.

We know

$$x^{(s)}\, {}^{n}C_{x} = n^{(s)}\, {}^{n-s}C_{x-s}.$$

Now

$$\mu'_{(s)} = E(X^{(s)}) = \sum_{x=0}^{n} x^{(s)}\, {}^{n}C_{x}\, p^{x} q^{n-x},$$

where the binomial variable x is considered in place of r, or

$$\mu'_{(s)} = \sum_{x=s}^{n} n^{(s)}\, {}^{n-s}C_{x-s}\, p^{x} q^{n-x} = n^{(s)} p^{s} \sum_{x=s}^{n} {}^{n-s}C_{x-s}\, p^{x-s} q^{n-x}$$

$$= n^{(s)} p^{s} (q+p)^{n-s} = n^{(s)} p^{s}, \text{ since } q + p = 1.$$

Now $E(X) = \mu'_1 = np$,

$$\mu'_2 = E(x^{2}) = E[x(x-1) + x] = E[x^{(2)}] + E(x) = n^{(2)}p^{2} + np = n(n-1)p^{2} + np,$$

$$\mu'_3 = E(x^{3}) = E[x(x-1)(x-2) + 3x(x-1) + x] = E[x^{(3)} + 3x^{(2)} + x] = n^{(3)}p^{3} + 3n^{(2)}p^{2} + np,$$

$$\mu'_4 = E(x^{4}) = E[x(x-1)(x-2)(x-3) + 6x(x-1)(x-2) + 7x(x-1) + x]$$

$$= E[x^{(4)} + 6x^{(3)} + 7x^{(2)} + x] = n^{(4)}p^{4} + 6n^{(3)}p^{3} + 7n^{(2)}p^{2} + np.$$

The central moments can be determined as above.
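The passage from factorial moments to moments about the origin can be verified by direct summation. The following Python sketch is an illustrative check only; the helper names `falling`, `raw_moments` and `raw_moment_direct`, and the values n = 10, p = 3/10, are assumptions chosen for the example. Exact fractions are used so that the two computations can be compared for equality.

```python
from fractions import Fraction
from math import comb

def falling(n, s):
    """n^(s) = n(n-1)...(n-s+1), with n^(0) = 1."""
    out = 1
    for i in range(s):
        out *= n - i
    return out

def raw_moments(n, p):
    """First four moments about the origin, built from mu'_(s) = n^(s) p^s."""
    f = [falling(n, s) * p**s for s in range(5)]
    return (f[1],
            f[2] + f[1],
            f[3] + 3 * f[2] + f[1],
            f[4] + 6 * f[3] + 7 * f[2] + f[1])

def raw_moment_direct(n, p, k):
    """sum_x x^k * nCx * p^x * q^(n-x), straight from the definition."""
    q = 1 - p
    return sum(x**k * comb(n, x) * p**x * q**(n - x) for x in range(n + 1))

n, p = 10, Fraction(3, 10)
from_factorial = raw_moments(n, p)
from_definition = tuple(raw_moment_direct(n, p, k) for k in (1, 2, 3, 4))
```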

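Theorem 2. For the binomial distribution, the mth moment about the arbitrary origin of zero success is given by

$$\mu'_m = \left(p\,\frac{\partial}{\partial p}\right)^{m} (q+p)^{n},$$

where we put q + p = 1 after differentiation.

Proof. Clearly

$$\left(p\,\frac{\partial}{\partial p}\right) p^{r} = r p^{r}, \qquad \left(p\,\frac{\partial}{\partial p}\right)^{2} p^{r} = r^{2} p^{r}, \qquad \ldots, \qquad \left(p\,\frac{\partial}{\partial p}\right)^{m} p^{r} = r^{m} p^{r}.$$

Now

$$\mu'_m = \sum_{r=0}^{n} r^{m}\, {}^{n}C_{r}\, p^{r} q^{n-r} = \sum_{r=0}^{n} {}^{n}C_{r}\, q^{n-r} \left(p\,\frac{\partial}{\partial p}\right)^{m} p^{r}$$

$$= \left(p\,\frac{\partial}{\partial p}\right)^{m} \left[\sum_{r=0}^{n} {}^{n}C_{r}\, p^{r} q^{n-r}\right] = \left(p\,\frac{\partial}{\partial p}\right)^{m} (q+p)^{n}.$$

Verification:

$$\mu'_2 = \left(p\,\frac{\partial}{\partial p}\right)^{2} (q+p)^{n} = \left(p\,\frac{\partial}{\partial p}\right)\left(p\,\frac{\partial}{\partial p}\right)(q+p)^{n} = \left(p\,\frac{\partial}{\partial p}\right)\left[np(q+p)^{n-1}\right]$$

$$= p\left[n(q+p)^{n-1} + (n-1)np(q+p)^{n-2}\right] = np + n(n-1)p^{2}.$$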

6.5. Recurrence relation for the moments of the binomial distribution.

Theorem 3. For the binomial distribution $(q+p)^{n}$,

$$\mu_{r+1} = pq\left[nr\,\mu_{r-1} + \frac{d\mu_r}{dp}\right],$$

where $\mu_r$ is the rth moment about the mean.

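Proof. We take the binomial variable as X, so that

$$P(X = x) = {}^{n}C_{x}\, p^{x} q^{n-x}.$$

In fact $E(X) = \mu'_1 = np$, so that

$$\mu_r = E(X - np)^{r} = \sum_{x=0}^{n} (x - np)^{r}\, {}^{n}C_{x}\, p^{x} q^{n-x}.$$

Differentiating with respect to p (with q = 1 - p),

$$\frac{d\mu_r}{dp} = \sum_{x=0}^{n} {}^{n}C_{x}\left[(x-np)^{r}\{x p^{x-1} q^{n-x} - (n-x) p^{x} q^{n-x-1}\} - nr\,(x-np)^{r-1} p^{x} q^{n-x}\right]$$

$$= \sum_{x=0}^{n} {}^{n}C_{x}\left[(x-np)^{r}\, p^{x-1} q^{n-x-1}\{xq - (n-x)p\} - nr\,(x-np)^{r-1} p^{x} q^{n-x}\right]$$

$$= \sum_{x=0}^{n} {}^{n}C_{x}\left[(x-np)^{r+1}\, p^{x-1} q^{n-x-1} - nr\,(x-np)^{r-1} p^{x} q^{n-x}\right],$$

since xq - (n-x)p = x - np. Hence

$$pq\,\frac{d\mu_r}{dp} = \sum_{x=0}^{n} \left[(x-np)^{r+1}\, {}^{n}C_{x}\, p^{x} q^{n-x} - nr\,pq\,(x-np)^{r-1}\, {}^{n}C_{x}\, p^{x} q^{n-x}\right] = \mu_{r+1} - nr\,pq\,\mu_{r-1}.$$

Therefore

$$\mu_{r+1} = pq\left[nr\,\mu_{r-1} + \frac{d\mu_r}{dp}\right],$$

completing the proof.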


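Particular cases of the theorem.

Putting r = 1 gives

$$\mu_2 = pq\left[n\mu_0 + \frac{d\mu_1}{dp}\right] = npq, \text{ since } \mu_0 = 1,\ \mu_1 = 0,$$

i.e. $\mu_2 = npq$.

Letting r = 2 gives

$$\mu_3 = pq\left[2n\mu_1 + \frac{d\mu_2}{dp}\right] = pq\,\frac{d}{dp}\,[np(1-p)] = pq\,\frac{d}{dp}\,[n(p - p^{2})]$$

$$= npq(1 - 2p) = npq(q - p).$$

Placing r = 3 provides

$$\mu_4 = pq\left[3n\mu_2 + \frac{d\mu_3}{dp}\right] = 3n^{2}p^{2}q^{2} + pq\,\frac{d}{dp}\,[np(1-p)(1-2p)]$$

$$= 3n^{2}p^{2}q^{2} + npq\,\frac{d}{dp}\,[p - 3p^{2} + 2p^{3}]$$

$$= 3n^{2}p^{2}q^{2} + npq(1 - 6p + 6p^{2})$$

$$= 3n^{2}p^{2}q^{2} + npq[1 - 6p(1-p)]$$

$$= 3n^{2}p^{2}q^{2} + npq(1 - 6pq).$$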
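The recurrence of Theorem 3 can be run mechanically by treating each central moment as a polynomial in p for a fixed n. The sketch below is one possible illustration, not part of the original text; polynomials are stored as coefficient lists (lowest power first), and the helper names and the values n = 12, p = 3/10 are assumptions made for the example.

```python
from fractions import Fraction
from math import comb

# Polynomials in p are coefficient lists, lowest degree first.
def poly_add(a, b):
    size = max(len(a), len(b))
    a = a + [0] * (size - len(a))
    b = b + [0] * (size - len(b))
    return [x + y for x, y in zip(a, b)]

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_deriv(a):
    return [i * c for i, c in enumerate(a)][1:] or [0]

def poly_eval(a, p):
    return sum(c * p**i for i, c in enumerate(a))

def central_moment_polys(n, r_max):
    """mu_0 .. mu_{r_max} via mu_{r+1} = pq (n r mu_{r-1} + d mu_r / dp)."""
    pq = [0, 1, -1]                     # p(1 - p) = p - p^2
    mu = [[1], [0]]                     # mu_0 = 1, mu_1 = 0
    for r in range(1, r_max):
        inner = poly_add([n * r * c for c in mu[r - 1]], poly_deriv(mu[r]))
        mu.append(poly_mul(pq, inner))
    return mu

n, p = 12, Fraction(3, 10)
q = 1 - p
mu = central_moment_polys(n, 4)
mu4 = poly_eval(mu[4], p)               # fourth central moment at p = 3/10
```

Evaluating the resulting polynomials at a particular p gives the same values as the closed forms npq, npq(q - p) and 3n²p²q² + npq(1 - 6pq) obtained above.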

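Exercises

1. Show that a measure of the skewness of the binomial distribution is given by $(q - p)/\sqrt{npq}$ and its kurtosis is

$$3 + (1 - 6pq)/(npq).$$

2. If x is a binomial variate with the probability law

$P(x) = {}^{n}C_{x}\, p^{x} q^{n-x}$, q = 1 - p, x = 0, 1, 2, ..., n, prove that

$$\operatorname{Cov}\left(\frac{x}{n},\ \frac{n-x}{n}\right) = -\frac{pq}{n}.$$

[Delhi M.A. (Stat.) 59]

[Hint. $E\left(\dfrac{x}{n}\right) = \dfrac{1}{n}\cdot np = p$ and $\operatorname{Cov}(x, y) = E(xy) - E(x)E(y)$, so that

$$\operatorname{Cov}\left(\frac{x}{n},\ \frac{n-x}{n}\right) = \frac{E(x)}{n} - \frac{E(x^{2})}{n^{2}} - p + p^{2} = p - \frac{1}{n^{2}}\left[n(n-1)p^{2} + np\right] - p + p^{2} = -\frac{pq}{n}.]$$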



3. A random sample of size n is drawn from an infinite population in which the individuals fall into one or the other of two categories with probability p and 1 - p. The numbers in the sample falling into the two categories are $n_1$ and $n_2$ ($n_1 + n_2 = n$). Show that the covariance between $n_1$ and $n_2$ is $-np(1-p)$. Hence obtain the variance of the difference in the proportions $\dfrac{n_1}{n} - \dfrac{n_2}{n}$.

[Hint. Clearly $n_1$ is a binomial variate with parameters n and p.

$$E(n_1) = np, \qquad E(n_2) = E(n - n_1) = n - E(n_1) = n - np = nq,$$

$$\operatorname{Cov}(n_1, n_2) = E(n_1 n_2) - E(n_1)E(n_2) = E(n_1 n - n_1^{2}) - n^{2}pq, \text{ since } n_2 = n - n_1,$$

$$= nE(n_1) - E(n_1^{2}) - n^{2}pq$$

$$= n^{2}p - npq - n^{2}p^{2} - n^{2}pq = n^{2}p(1-q) - n^{2}p^{2} - npq = -npq.$$

$$V\left(\frac{n_1}{n} - \frac{n_2}{n}\right) = \frac{1}{n^{2}}\left[V(n_1) - 2\operatorname{cov}(n_1, n_2) + V(n_2)\right], \text{ since } V(n_1) = E[n_1 - E(n_1)]^{2},$$

$$= \frac{1}{n^{2}}\,[npq + 2npq + npq] = \frac{4pq}{n}.]$$

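6.6. Moment Generating Function of the binomial distribution.

The moment generating function about the mean is

$$M_{X-np}(t) = E\left[e^{t(X-np)}\right] = e^{-npt}E\left(e^{tX}\right) = e^{-npt}M_X(t) = e^{-npt}(q + pe^{t})^{n} = \left(qe^{-pt} + pe^{qt}\right)^{n}.$$

Now

$$M_{X-np}(t) = \left[q\left(1 - pt + \frac{p^{2}t^{2}}{2!} - \frac{p^{3}t^{3}}{3!} + \cdots\right) + p\left(1 + qt + \frac{q^{2}t^{2}}{2!} + \frac{q^{3}t^{3}}{3!} + \cdots\right)\right]^{n}$$

$$= \left[1 + pq\,\frac{t^{2}}{2!} + pq(q^{2} - p^{2})\,\frac{t^{3}}{3!} + \cdots\right]^{n}$$

$$= 1 + n\left[pq\,\frac{t^{2}}{2!} + pq(q^{2} - p^{2})\,\frac{t^{3}}{3!} + \cdots\right] + \frac{n(n-1)}{2!}\left[pq\,\frac{t^{2}}{2!} + pq(q^{2} - p^{2})\,\frac{t^{3}}{3!} + \cdots\right]^{2} + \cdots$$

$$= 1 + \mu_2\,\frac{t^{2}}{2!} + \mu_3\,\frac{t^{3}}{3!} + \mu_4\,\frac{t^{4}}{4!} + \cdots$$

Comparison of the coefficients of $\dfrac{t^{2}}{2!}$ and $\dfrac{t^{3}}{3!}$ gives

$$\mu_2 = npq, \qquad \mu_3 = npq(q^{2} - p^{2}) = npq(q - p).$$

Thus

$$\beta_1 = \frac{\mu_3^{2}}{\mu_2^{3}} = \frac{(q-p)^{2}}{npq} = \frac{(1-2p)^{2}}{npq}$$

and

$$\gamma_1 = \sqrt{\beta_1} = \frac{1 - 2p}{\sqrt{npq}}.$$

Hence the skewness is positive for $p < \tfrac{1}{2}$, negative for $p > \tfrac{1}{2}$ and zero for $p = \tfrac{1}{2}$.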

6.7. The Cumulant generating function and cumulants of the binomial distribution.

The cumulant generating function of the binomial variate is given by

$$C_X(t) = \log M_X(t) = \log (q + pe^{t})^{n} = n \log (q + pe^{t}),$$

that is,

$$k_1 t + k_2\,\frac{t^{2}}{2!} + \cdots + k_r\,\frac{t^{r}}{r!} + \cdots = n \log (q + pe^{t})$$

or

$$\sum_{r=1}^{\infty} k_r\,\frac{t^{r}}{r!} = n \log (q + pe^{t}), \qquad q = 1 - p.$$

Differentiating this w.r.t. p and t, we obtain

$$\sum_{r=1}^{\infty} \frac{dk_r}{dp}\,\frac{t^{r}}{r!} = \frac{n(e^{t} - 1)}{q + pe^{t}}, \qquad \sum_{r=1}^{\infty} k_r\,\frac{t^{r-1}}{(r-1)!} = \frac{npe^{t}}{q + pe^{t}}.$$

Now

$$pq \sum_{r=1}^{\infty} \frac{dk_r}{dp}\,\frac{t^{r}}{r!} = \frac{npq(e^{t} - 1)}{q + pe^{t}} = \frac{npe^{t}}{q + pe^{t}} - np = \sum_{r=0}^{\infty} k_{r+1}\,\frac{t^{r}}{r!} - np.$$

Whence, equating the coefficients of $\dfrac{t^{r}}{r!}$,

$$k_{r+1} = pq\,\frac{dk_r}{dp}, \qquad r \ge 1.$$

Since $k_1 = \mu'_1 = np$,

$$k_2 = pq\,\frac{dk_1}{dp} = npq, \qquad k_3 = pq\,\frac{dk_2}{dp} = npq(q - p), \qquad k_4 = pq\,\frac{dk_3}{dp} = npq(1 - 6pq).$$

Exercises 

1. Find the m.g.f. of the binomial distribution about its mean, and hence write the expression for the first three moments about the mean. What can you say about the skewness of the binomial distribution?

2. Find the condition under which the sum of two independent binomial variates of parameters $(n_1, p_1)$ and $(n_2, p_2)$ is a binomial variate.

[Hint. Let $X_1$, $X_2$ be the independent binomial variates with parameters $(n_1, p_1)$ and $(n_2, p_2)$ respectively. Then

$$M_{X_1}(t) = (q_1 + p_1 e^{t})^{n_1}, \qquad M_{X_2}(t) = (q_2 + p_2 e^{t})^{n_2}.$$

Let $Z = X_1 + X_2$. Then

$$M_Z(t) = M_{X_1}(t)\, M_{X_2}(t) = (q_1 + p_1 e^{t})^{n_1} (q_2 + p_2 e^{t})^{n_2}.$$

It will be the m.g.f. of a binomial variate if and only if

$$q_1 = q_2, \qquad p_1 = p_2,$$

so that $M_Z(t) = (q_1 + p_1 e^{t})^{n_1 + n_2}$, which is the m.g.f. of a binomial variate with parameters $n_1 + n_2$ and $p_1$.]

3. Show that for the binomial distribution

$$k_{r+1} = pq\,\frac{dk_r}{dp}, \qquad r \ge 1.$$

Hence deduce that $k_4 = npq(1 - 6pq)$. [Delhi M.A. (Stat.) 59]

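4. Obtain the m.g.f. of the standard binomial variate $(x - np)/\sqrt{npq}$, find the limit of this function as $n \to \infty$ and interpret the result if you can. [Delhi B.A. (Hons) 1954]

[Hint. $M_X(t) = (q + pe^{t})^{n}$. Then

$$M_{\frac{X - np}{\sqrt{npq}}}(t) = E\left[e^{t(X - np)/\sqrt{npq}}\right] = e^{-npt/\sqrt{npq}}\, E\left[e^{tX/\sqrt{npq}}\right] = e^{-npt/\sqrt{npq}}\, M_X\left(\frac{t}{\sqrt{npq}}\right)$$

$$= e^{-npt/\sqrt{npq}} \left(q + pe^{t/\sqrt{npq}}\right)^{n} = \left(qe^{-pt/\sqrt{npq}} + pe^{qt/\sqrt{npq}}\right)^{n}$$

$$= \left[q\left(1 - \frac{pt}{\sqrt{npq}} + \frac{p^{2}t^{2}}{2!\,npq} - \cdots\right) + p\left(1 + \frac{qt}{\sqrt{npq}} + \frac{q^{2}t^{2}}{2!\,npq} + \cdots\right)\right]^{n}$$

$$= \left[1 + \frac{t^{2}}{2n} + \cdots\right]^{n}.$$

When $n \to \infty$ the right hand expression tends to $e^{t^{2}/2}$, which is the m.g.f. of the normal variate N(0, 1).]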

Ex. 5. Show that the factorial moment generating function $\omega(t)$ of the binomial distribution $(q + p)^{n}$ is $(1 + pt)^{n}$. Hence or otherwise show that $\mu'_{(r)} = n^{(r)} p^{r}$.

[Hint. $\omega(t) = E[(1+t)^{x}] = \sum_{x} {}^{n}C_{x}\, p^{x} q^{n-x} (1+t)^{x} = \sum_{x} {}^{n}C_{x}\, \{p(1+t)\}^{x} q^{n-x} = \{q + p(1+t)\}^{n} = (1 + pt)^{n}$.

The coefficient of $\dfrac{t^{r}}{r!}$ in the expansion of $\omega(t)$ gives $\mu'_{(r)} = n^{(r)} p^{r}$.]


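6.8. Limiting form of the Binomial Distribution

Case I. $p = \tfrac{1}{2}$.

A coin is tossed 2n times where n is large. To find the probability of n + r heads and therefore of n - r tails.

Let P be the probability. Then

$$P = {}^{2n}C_{n+r} \left(\tfrac{1}{2}\right)^{n+r} \left(\tfrac{1}{2}\right)^{n-r} = \frac{(2n)!}{(n+r)!\,(n-r)!}\cdot\frac{1}{2^{2n}}.$$

Then, using Stirling's formula for n!, viz.

$$n! \sim n^{n+\frac{1}{2}}\, e^{-n} \sqrt{2\pi}$$

(the limiting value of the ratio of these two expressions, as n tends to infinity, being unity), we have

$$P = \frac{(2n)^{2n+\frac{1}{2}}\, e^{-2n} \sqrt{2\pi}}{(n+r)^{n+r+\frac{1}{2}}\, e^{-n-r} \sqrt{2\pi}\; (n-r)^{n-r+\frac{1}{2}}\, e^{-n+r} \sqrt{2\pi}}\cdot\frac{1}{2^{2n}}$$

$$= \frac{1}{\sqrt{\pi n}} \left(1 + \frac{r}{n}\right)^{-n-r-\frac{1}{2}} \left(1 - \frac{r}{n}\right)^{-n+r-\frac{1}{2}}.$$

Hence

$$\log\left(P\sqrt{\pi n}\right) = -\left(n + r + \tfrac{1}{2}\right)\log\left(1 + \frac{r}{n}\right) - \left(n - r + \tfrac{1}{2}\right)\log\left(1 - \frac{r}{n}\right).$$

For r < n,

$$\log\left(P\sqrt{\pi n}\right) = -\left(n + r + \tfrac{1}{2}\right)\left(\frac{r}{n} - \frac{r^{2}}{2n^{2}} + \frac{r^{3}}{3n^{3}} - \cdots\right) + \left(n - r + \tfrac{1}{2}\right)\left(\frac{r}{n} + \frac{r^{2}}{2n^{2}} + \frac{r^{3}}{3n^{3}} + \cdots\right)$$

$$= -\frac{r^{2}}{n} + \text{terms of higher order in } \frac{1}{n}.$$

Therefore

$$P = \frac{1}{\sqrt{\pi n}}\, e^{-r^{2}/n} \text{ approx.}$$

Thus the probability of n + r heads, a discrepancy of r from the mean in 2n trials, is

$$\frac{1}{\sqrt{\pi n}}\, e^{-r^{2}/n}.$$

Since $\sigma^{2} = 2n\cdot\tfrac{1}{2}\cdot\tfrac{1}{2} = \dfrac{n}{2}$,

$$P = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-r^{2}/2\sigma^{2}}.$$

The probability of no discrepancy (that is, of n heads and n tails, i.e. r = 0) is $\dfrac{1}{\sqrt{\pi n}}$, which tends to zero as n tends to infinity.

In fact the probability for every n + r tends to zero.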

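To find the continuous distribution function in the standard form we put

$$r = x\sigma = x\sqrt{n/2}, \qquad r + 1 = (x + \delta x)\sqrt{n/2},$$

i.e. $\delta x = \sqrt{2/n}$. [$r\sqrt{2/n} = x$; corresponding to unit increment in r we have the increment $\sqrt{2/n}$ in x, which may be denoted by $\delta x$ when $n \to \infty$.]

Now the probability of the discrepancy in the number of heads being $\ge r$ and $< r + 1$ is $\dfrac{1}{\sqrt{\pi n}}\, e^{-r^{2}/n}$.

Therefore the probability of this number lying between $n + x\sqrt{n/2}$ and $n + (x + \delta x)\sqrt{n/2}$ is

$$\frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}\, \delta x \qquad \left[\text{since } \frac{r^{2}}{n} = \frac{x^{2}}{2} \text{ and } \frac{1}{\sqrt{\pi n}} = \frac{\delta x}{\sqrt{2\pi}}\right].$$

The probability of the number of heads lying between n - s and n + s out of 2n trials

$$= \int_{-s\sqrt{2/n}}^{\,s\sqrt{2/n}} \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}\, dx = \frac{2}{\sqrt{2\pi}} \int_{0}^{\,s\sqrt{2/n}} e^{-x^{2}/2}\, dx.$$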




Exercises

1. A coin is tossed N times where N is large. Find the probability of the number of heads lying between $\tfrac{1}{2}N - \tfrac{1}{2}\sqrt{N}$ and $\tfrac{1}{2}N + \tfrac{1}{2}\sqrt{N}$.

[Hint. Probability $= \dfrac{2}{\sqrt{2\pi}} \displaystyle\int_{0}^{1} e^{-x^{2}/2}\, dx$.]

2. If a coin is tossed N times, where N is a very large even number, show that the probability of getting exactly $(\tfrac{1}{2}N - x)$ heads and $(\tfrac{1}{2}N + x)$ tails is approximately

$$\left(\frac{2}{\pi N}\right)^{1/2} e^{-2x^{2}/N}.$$

[Agra M.Sc. 58]

[In the article discussed above take 2n = N, i.e. n = N/2, and replace r by x.]

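6.81. Limiting form of the binomial distribution.

Case II. General Case.

For n trials, the mean number of successes is np, where p is the probability of success in a single trial, and the probability of np + x successes is

$$P = {}^{n}C_{np+x}\, p^{np+x} q^{n-(np+x)} = \frac{n!}{(np+x)!\,(nq-x)!}\, p^{np+x} q^{nq-x},$$

since n - (np + x) = nq - x.

Using Stirling's approximation,

$$P = \frac{n^{n+\frac{1}{2}}\, e^{-n} \sqrt{2\pi}\; p^{np+x} q^{nq-x}}{(np+x)^{np+x+\frac{1}{2}}\, e^{-np-x} \sqrt{2\pi}\; (nq-x)^{nq-x+\frac{1}{2}}\, e^{-nq+x} \sqrt{2\pi}}$$

$$= \frac{1}{\sqrt{2\pi npq}} \left(1 + \frac{x}{np}\right)^{-np-x-\frac{1}{2}} \left(1 - \frac{x}{nq}\right)^{-nq+x-\frac{1}{2}}.$$

Taking logarithms of both sides we may write, provided |x| is less than the smaller of the two quantities np and nq,

$$\log\left(P\sqrt{2\pi npq}\right) = \left(-np - x - \tfrac{1}{2}\right)\left(\frac{x}{np} - \frac{x^{2}}{2n^{2}p^{2}} + \cdots\right) + \left(nq - x + \tfrac{1}{2}\right)\left(\frac{x}{nq} + \frac{x^{2}}{2n^{2}q^{2}} + \cdots\right)$$

$$= -\frac{x^{2}}{2n}\left(\frac{1}{p} + \frac{1}{q}\right) + \frac{x}{2n}\left(\frac{1}{q} - \frac{1}{p}\right) + \cdots$$

$$= -\frac{x^{2}}{2npq} + \frac{x}{2n}\left(\frac{1}{q} - \frac{1}{p}\right) + \cdots$$

Introducing a new variable, $z = x/\sqrt{n}$, we may write the above as

$$\log\left(P\sqrt{2\pi npq}\right) = \frac{z}{2\sqrt{n}}\left(\frac{1}{q} - \frac{1}{p}\right) - \frac{z^{2}}{2pq} + \cdots$$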




This series is convergent so long as |z| is less than the smaller of the quantities $p\sqrt{n}$ and $q\sqrt{n}$. Now let n tend to infinity. Then, provided that neither p nor q is very small, all the terms of the above series tend to zero except $-z^{2}/2pq$, so that, when n tends to infinity, we have

$$\log\left(P\sqrt{2\pi npq}\right) = -\frac{z^{2}}{2pq}$$

or

$$P = \frac{1}{\sqrt{2\pi npq}}\, e^{-z^{2}/2pq} \text{ approx.}$$

Corresponding to unit increment in x we have the increment $1/\sqrt{n}$ in z, which may be denoted by dz when n tends to infinity. And, if we write dP for the limiting value of P, the above formula for P becomes

$$dP = \frac{1}{\sqrt{2\pi pq}}\, e^{-z^{2}/2pq}\, dz,$$

giving the probability of z falling in the interval dz, and we have the required continuous distribution of z.

Since V(x) = npq, $V(z) = (1/n)\cdot npq = pq$.

Now let $\sigma^{2} = V(z) = pq$; then the above probability becomes

$$dP = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-z^{2}/2\sigma^{2}}\, dz,$$

which is the form of the normal distribution $N(0, \sigma^{2})$.
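The quality of this limiting form for a finite n can be inspected numerically. The sketch below compares the exact binomial probability of np + x successes with the approximation $\frac{1}{\sqrt{2\pi npq}}\, e^{-x^{2}/2npq}$; the function names and the values n = 400, p = 0.3 are illustrative assumptions, not taken from the text.

```python
from math import comb, exp, pi, sqrt

def binom_pmf(k, n, p):
    """Exact probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_approx(k, n, p):
    """(1 / sqrt(2 pi npq)) * exp(-x^2 / (2npq)) with x = k - np."""
    var = n * p * (1 - p)
    x = k - n * p
    return exp(-x * x / (2 * var)) / sqrt(2 * pi * var)

n, p = 400, 0.3                          # illustrative values; np = 120
exact_at_mean = binom_pmf(120, n, p)
approx_at_mean = normal_approx(120, n, p)
rel_err = abs(exact_at_mean - approx_at_mean) / exact_at_mean
```

Away from the mean the neglected terms of the series, in particular the term in $\frac{x}{2n}\left(\frac{1}{q} - \frac{1}{p}\right)$, make the agreement less close when p differs much from q.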


Exercises

1. Fitting Binomial Distribution. The following data are the number of seeds germinating out of 10 on damp filter for 80 sets of seeds. Fit a binomial distribution to these data:

x :   0   1   2   3   4   5   6   7   8   9  10   Total
f :   6  20  28  12   8   6   0   0   0   0   0    80

[Agra 55]

[Here n = 10, N = 80, and total frequency $\sum f_i = 80$.

Arithmetic mean

$$= \frac{\sum f_i x_i}{\sum f_i} = \frac{1 \times 20 + 2 \times 28 + 3 \times 12 + 4 \times 8 + 5 \times 6}{80} = \frac{174}{80}.$$

But mean = np; hence np = 174/80,
therefore p = 174/800 = 0.2175, q = 1 - p = 0.7825.

Hence the binomial distribution to be fitted to the data is

$$80\,(0.7825 + 0.2175)^{10}.$$

From this expression we obtain the expected frequencies of 0, 1, 2, ..., 10 successes. These are approximately

6.9, 19.1, 24.0, 17.8, 8.6, 2.9, 0.7, 0.1, 0, 0, 0.]
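The fitting procedure of this exercise (estimate p from the mean, then generate the expected frequencies by the recurrence formula) may be sketched in Python as follows. The function name `fit_binomial` is an illustrative choice and the routine assumes the estimated p lies strictly between 0 and 1.

```python
def fit_binomial(n, freqs):
    """Fit N(q + p)^n to observed frequencies freqs[x], x = 0..n.

    p is estimated by equating the sample mean to np; the expected
    frequencies then follow from f(x+1) = (n-x)/(x+1) * (p/q) * f(x).
    """
    N = sum(freqs)
    mean = sum(x * f for x, f in enumerate(freqs)) / N
    p = mean / n
    q = 1 - p
    expected = [N * q**n]               # expected frequency of 0 successes
    for x in range(n):
        expected.append(expected[-1] * (n - x) / (x + 1) * p / q)
    return p, expected

seeds = [6, 20, 28, 12, 8, 6, 0, 0, 0, 0, 0]    # data of Exercise 1
p_hat, expected = fit_binomial(10, seeds)
print(round(p_hat, 4), [round(e, 1) for e in expected])
```

The computed frequencies agree with those quoted in the hint to within the rounding used there.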

2. The distribution of headless matches per box of 50 in a total of 100 boxes is given in the following table:

No. of headless matches per box : x    0   1   2   3   4   5   6   7   Total
No. of boxes : f                      12  27  29  19   8   4   1   0    100

Assuming the distribution of headless matches per box over 100 boxes is binomial, fit a binomial curve to the above data. Compare the variances of the observed and the theoretical distributions.

[Hint. Here n = 50, N = 100;

$\bar{x} = np = 2$ provides $p = \tfrac{2}{50} = 0.04$.

Expected frequencies are given by $N\,{}^{n}C_{x}\, p^{x} q^{n-x}$. The use of the recursion formula will prove helpful. $f(0) = Nq^{n} = 100\,(0.96)^{50} \approx 13.0$.

The theoretical frequencies to the nearest integer are

13, 27, 28, 18, 9, 3, 1, 0.

Observed variance $= \dfrac{1}{N}\sum f_i x_i^{2} - \bar{x}^{2}$ ($\bar{x} = 2$) $= 1.78$.

Theoretical variance = npq = 50 (0.04)(0.96) = 1.92.]

3. In 103 litters of 4 mice, the numbers of females were noted as under:

No. of female mice :   0   1   2   3   4   Total
No. of litters     :   8  32  34  24   5    103

(a) If the chance of obtaining a female in a single trial is supposed constant, estimate this constant but unknown probability.

(b) If the size of the litter (4) had not been given, how could it be estimated from the data?

(c) How could the assumption that the chance of obtaining a female in a single trial is constant be tested?

[p = 0.466]

4. Eight coins are tossed at a time, 256 times. The number of heads observed at each throw is recorded and the results are given below. Find the expected frequencies. What are the theoretical values of the mean and standard deviation? Calculate also the mean and s.d. of the observed frequencies.

No. of heads at a throw :   0   1   2   3   4   5   6   7   8
Frequency               :   2   6  30  52  67  56  32  10   1

5. The following data due to Weldon show the results of throwing 12 dice 4096 times, a throw of 4, 5 or 6 being called a success (x).

x :   0   1   2    3    4    5    6    7    8    9   10   11   12   Total
f :   0   7  60  198  430  731  948  847  536  257   71   11    0    4096

Fit the binomial distribution and calculate the expected frequencies. Compare the actual mean and s.d. with those of the expected ones for the distribution.

6. Assuming that half the population are consumers of chocolate, so that the chance of an individual being a consumer is $\tfrac{1}{2}$, and assuming that 100 investigators each take 10 individuals to see whether they are consumers, how many investigators would you expect to report that three people or less were consumers?

[Hint. $100\left[{}^{10}C_{0}\left(\tfrac{1}{2}\right)^{10} + {}^{10}C_{1}\left(\tfrac{1}{2}\right)^{10} + {}^{10}C_{2}\left(\tfrac{1}{2}\right)^{10} + {}^{10}C_{3}\left(\tfrac{1}{2}\right)^{10}\right] = 17$ nearly.]

7. Bring out the fallacy, if any, in the following statement:

The mean of a binomial distribution is 5 and its standard deviation is 3.

8. An irregular six-faced die is thrown and the expectation that in 10 throws it will give five even numbers is twice the expectation that it will give four even numbers. How many times in 10,000 sets of 10 throws would you expect it to give no even number?

[Hint. $10{,}000\left(\tfrac{3}{8}\right)^{10} = 1$ approx.]

9. Six dice are thrown 729 times. How many times do you expect at least three dice to show a 5 or 6?

[Ans. 233]

10. In a precision bombing attack there is a 50% chance that any one bomb will strike the target. Two direct hits are required to destroy the target completely.

How many bombs must be dropped to give a 99% chance or better of completely destroying the target?

[Hint. Out of n, at least 2 bombs must hit the target for destroying it completely. Hence

$$P(2) + P(3) + \cdots + P(n) \ge 0.99,$$

where $P(x) = {}^{n}C_{x}\left(\tfrac{1}{2}\right)^{n}$,

or $1 - P(0) - P(1) \ge 0.99$,

whence $100(n+1) \le 2^{n}$. The least positive integral value of n satisfying this inequality is n = 11.]

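11. In a game of n throws of a die, for what value of n is the probability of getting at least two 6's larger than $\tfrac{1}{2}$?

[Hint: ${}^{n}C_{0}\left(\tfrac{1}{6}\right)^{0}\left(\tfrac{5}{6}\right)^{n} + {}^{n}C_{1}\left(\tfrac{1}{6}\right)\left(\tfrac{5}{6}\right)^{n-1} < \tfrac{1}{2}$

$\Rightarrow n \ge 10$.]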

12. Is the sum of two independent binomial variates a binomial variate? If not, what are the conditions under which it is so?

13. Show that for the binomial distribution, the factorial moments about the mean are

$$\mu_{(2)} = npq \quad \text{and} \quad \mu_{(3)} = -2npq(p + 1).$$

[Hint. $\mu'_{(r)} = n^{(r)} p^{r}$ gives:

$$\mu_{(3)} = \mu'_{(3)} - 3\mu'_{(2)}\mu'_{(1)} + 2\mu_{(1)}'^{3} - 2\mu'_{(1)}.]$$

14. Show that the m.g.f. of the binomial variate about the mean is $(qe^{-pt} + pe^{qt})^{n}$, where the symbols have their usual meaning. Hence or otherwise obtain $\beta_1$ and $\beta_2$ for this distribution. (Agra B.Sc. 66)

6.9. The Poisson Distribution. Let X be a binomially distributed random variable with parameter p (based on n repetitions of an experiment). That is,

$$P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}.$$

Theorem 4. Suppose that as $n \to \infty$, np = m (const), or equivalently, as $n \to \infty$, $p \to 0$ such that $np \to m$, a finite number. Under these conditions we have

$$\lim_{n \to \infty} P(X = r) = \frac{e^{-m} m^{r}}{r!}, \qquad r = 0, 1, 2, \ldots,$$

the Poisson distribution with parameter m.

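Proof. Consider the general expression for the binomial probability, $P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}$.

Let np = m. Hence $p = \dfrac{m}{n}$ and $1 - p = 1 - \dfrac{m}{n} = \dfrac{n - m}{n}$.

Replacing all terms involving p by their equivalent expressions in terms of m/n, we find

$$P(X = r) = \frac{n(n-1)\cdots(n-r+1)}{r!}\left(\frac{m}{n}\right)^{r}\left(1 - \frac{m}{n}\right)^{n-r}$$

$$= \frac{m^{r}}{r!}\left[1\cdot\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{r-1}{n}\right)\right]\left(1 - \frac{m}{n}\right)^{n}\left(1 - \frac{m}{n}\right)^{-r}.$$

Now let $n \to \infty$ and $p \to 0$ in such a way that $np \to m$. Recall that

$$\left(1 - \frac{m}{n}\right)^{n} \to e^{-m} \text{ as } n \to \infty,$$

while each of the remaining factors in brackets tends to 1. Then

$$\lim_{n \to \infty} P(X = r) = \frac{e^{-m} m^{r}}{r!}.$$

That is, in the limit we obtain the Poisson distribution with parameter m.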

Remark. The binomial probabilities may be approximated with the probabilities of the Poisson distribution whenever n is large and p is small.

In view of the assumed smallness of p, $P(X = r) = \dfrac{e^{-m} m^{r}}{r!}$ is sometimes referred to as the formula for the probabilities of rare events.
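The closeness of the approximation in the Remark can be seen by tabulating both sets of probabilities side by side. The following Python sketch is illustrative only; the function names and the particular values n = 1000, p = 0.0001 are assumptions chosen to represent "n large and p small".

```python
from math import comb, exp, factorial

def binom_pmf(r, n, p):
    """nCr * p^r * q^(n-r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, m):
    """e^(-m) m^r / r!"""
    return exp(-m) * m**r / factorial(r)

n, p = 1000, 0.0001          # large n, small p (illustrative values)
m = n * p                    # Poisson parameter m = np
pairs = [(binom_pmf(r, n, p), poisson_pmf(r, m)) for r in range(4)]
max_gap = max(abs(b - po) for b, po in pairs)
for r, (b, po) in enumerate(pairs):
    print(r, round(b, 6), round(po, 6))
```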

The application of the Poisson distribution is advised in the following cases:

(a) the number of railroad accidents in some unit of time,

(b) the number of insurance claims in some unit of time,

(c) the number of deaths by the kicks of a horse in some unit of time.

Definition. A random variable X is said to follow a Poisson distribution if it assumes only non-negative integral values and its probability mass function is given by

$$p(r, m) = P(X = r) = \frac{e^{-m} m^{r}}{r!} \text{ for } r = 0, 1, 2, \ldots$$
$$= 0, \text{ otherwise.}$$

We call m the parameter of the Poisson distribution, and m > 0.





The following examples provide interesting cases of Poisson variates:

(1) The number of accidents of cars at a busy traffic crossing in time t.

(2) The number of suicides or deaths by heart attack in time t.

(3) The number of electrons released from the cathode of a vacuum tube in time t.

(4) [Application from Astronomy] The number of stars found in a portion of the Milky Way having volume V, where we assume that in the portion considered the density of stars, say $\lambda$, is constant, i.e. in a volume of V cubic units one would find on the average $\lambda V$ stars.

Note: $P[X_V = n] = (\lambda V)^{n} e^{-\lambda V}/n!$

(5) [Application from Biology] The number of cells visible under the microscope, where the visible surface area under the microscope is given by A square units.

Exercises 

1. Suppose that the number of calls received by one operator in a particular 5-minute interval from 10:30 a.m. to 10:35 a.m. follows a Poisson distribution with mean = 4. Find the probability that on a further working day, the operator will receive in this interval of time

(i) not more than one call,
(ii) four or more calls. Given $e^{-4} = 0.0183$.

[Hint. Here m = 4,

$$P(X = 0 \text{ or } 1) = \sum_{r=0}^{1} \frac{e^{-4}\, 4^{r}}{r!}, \qquad P(X \ge 4) = 1 - \sum_{r=0}^{3} \frac{e^{-4}\, 4^{r}}{r!}.]$$

2. At a busy traffic intersection the probability, p, of an individual car having an accident is very small, say p = 0.0001. However, during a certain part of the day, say between 4 p.m. and 6 p.m., a large number of cars pass through the intersection, say something like n = 1000. Under these conditions, what is the probability of two or more accidents occurring during that period?

[Hint. Here np = m = 0.1. Thus

$$P(X = r) = \frac{e^{-0.1}\,(0.1)^{r}}{r!}.$$

Hence

$$P(X \ge 2) = 1 - \sum_{r=0}^{1} \frac{e^{-0.1}\,(0.1)^{r}}{r!} = 1 - e^{-0.1}(1 + 0.1) = 0.0047.]$$


3. Give some examples of the occurrence of Poisson distribution in different fields. [Delhi M.A. 60, Patna 5 ]

4. Comment on the statement: 'Poisson distribution is of such frequent occurrence that it is not proper to consider it only as a limiting case of the binomial distribution'.

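5. Compound Binomial distribution. Suppose that the probability of an insect laying x eggs is $\dfrac{e^{-\lambda}\lambda^{x}}{x!}$ and that the probability of an egg developing is p. Assuming mutual independence of the eggs, show that the probability of a total of r survivors is given by the Poisson distribution with parameter $\lambda p$.

[Hint.

$$P(X = r) = \sum_{x=r}^{\infty} \frac{e^{-\lambda}\lambda^{x}}{x!}\, {}^{x}C_{r}\, p^{r} (1-p)^{x-r} = \frac{e^{-\lambda}}{r!}\left(\frac{p}{1-p}\right)^{r} \sum_{x=r}^{\infty} \frac{[\lambda(1-p)]^{x}}{(x-r)!},$$

summation on x = r, r+1, ..., $\infty$. Let j = x - r; then

$$P(X = r) = \frac{e^{-\lambda}}{r!}\left(\frac{p}{1-p}\right)^{r} [\lambda(1-p)]^{r} \sum_{j=0}^{\infty} \frac{[\lambda(1-p)]^{j}}{j!},$$

summation on j = 0, 1, ..., $\infty$,

$$= \frac{e^{-\lambda}(\lambda p)^{r}}{r!}\, e^{(1-p)\lambda} = \frac{e^{-\lambda p}(\lambda p)^{r}}{r!}.]$$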

Ex. 6. Define clearly conditional probability, conditional distribution and conditional expectation. Illustrate your definitions (you may confine yourself to discrete distributions). X is a Poisson variate with parameter m and Y is another discrete variate whose conditional distribution for given X is defined by

$$P(Y = r \mid X = x) = {}^{x}C_{r}\, p^{r} (1-p)^{x-r}, \qquad r = 0, 1, \ldots, x.$$

Then the unconditional distribution of Y is that of a Poisson variate with parameter mp.

[Bombay 68, Poona 71, Kashmir 70, Patna 66]





[Hint.

$$P(Y = r \cap X = x) = P(Y = r \mid X = x)\, P(X = x) = \frac{e^{-m} m^{x}}{x!}\, {}^{x}C_{r}\, p^{r} (1-p)^{x-r}.$$

Thus

$$P(Y = r) = \sum_{x} P(Y = r \cap X = x),$$

summation on x = 0, 1, 2, ..., $\infty$.

Then use Ex. 5.]

6.10. Four moments of the Poisson distribution with $\beta_1$, $\beta_2$, $\gamma_1$ and $\gamma_2$.

We are given $P(X = r) = \dfrac{e^{-m} m^{r}}{r!}$, r = 0, 1, 2, ..., where m is the parameter of the Poisson distribution.

$$\mu'_1 = E(X) = \sum_{r=0}^{\infty} r\,\frac{e^{-m} m^{r}}{r!} = e^{-m}\, m \sum_{r=1}^{\infty} \frac{m^{r-1}}{(r-1)!} = e^{-m}\, m\, e^{m} = m.$$

Thus the mean of the Poisson distribution is m.

$$\mu'_2 = E(X^{2}) = \sum_{r=0}^{\infty} r^{2}\,\frac{e^{-m} m^{r}}{r!} = e^{-m} \sum_{r=0}^{\infty} [r(r-1) + r]\,\frac{m^{r}}{r!}, \text{ summation on } r = 0, 1, 2, \ldots$$

$$= e^{-m}\left[m^{2} \sum_{r=2}^{\infty} \frac{m^{r-2}}{(r-2)!} + m \sum_{r=1}^{\infty} \frac{m^{r-1}}{(r-1)!}\right] = m^{2} + m,$$

therefore $\sigma^{2} = \mu_2 = E(X^{2}) - E^{2}(X) = m^{2} + m - m^{2} = m$.

Thus the variance of the Poisson distribution is m.

Note. Observe the interesting property which a Poisson random variable possesses: its expectation equals its variance.

$$\mu'_3 = e^{-m} \sum_{r=0}^{\infty} \frac{r^{3} m^{r}}{r!}, \text{ summation on } r = 0, 1, 2, \ldots, \infty.$$

Writing $r^{3} = r(r-1)(r-2) + 3r(r-1) + r$ we have

$$\mu'_3 = m^{3} + 3m^{2} + m.$$

$$\mu'_4 = e^{-m} \sum_{r=0}^{\infty} \frac{r^{4} m^{r}}{r!}.$$

Since $r^{4} = r(r-1)(r-2)(r-3) + 6r(r-1)(r-2) + 7r(r-1) + r$,

$$\mu'_4 = m^{4} + 6m^{3} + 7m^{2} + m.$$




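The central moments $\mu_3$ and $\mu_4$ are given by

$$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\mu_1'^{3} = m^{3} + 3m^{2} + m - 3m^{2}(m + 1) + 2m^{3} = m,$$

$$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2\mu_1'^{2} - 3\mu_1'^{4}$$

$$= m^{4} + 6m^{3} + 7m^{2} + m - 4m(m^{3} + 3m^{2} + m) + 6m^{2}\cdot m(m + 1) - 3m^{4} = 3m^{2} + m.$$

Hence

$$\beta_1 = \frac{\mu_3^{2}}{\mu_2^{3}} = \frac{1}{m}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^{2}} = 3 + \frac{1}{m}.$$

If $m \to \infty$, $\beta_1 \to 0$ and $\beta_2 \to 3$, that is, the Poisson distribution tends to the normal distribution.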
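These results can be confirmed numerically by summing the defining series far enough for the remaining terms to be negligible. The Python sketch below is an illustration under that assumption; the function name, the number of terms retained and the value m = 2.5 are arbitrary choices of ours.

```python
from math import exp

def poisson_central_moments(m, terms=200):
    """mu_2, mu_3, mu_4 of a Poisson(m) variate from a truncated series."""
    prob = exp(-m)                      # P(X = 0)
    mu2 = mu3 = mu4 = 0.0
    for r in range(terms):
        d = r - m                       # deviation from the mean m
        mu2 += d**2 * prob
        mu3 += d**3 * prob
        mu4 += d**4 * prob
        prob *= m / (r + 1)             # P(X = r+1) = m/(r+1) * P(X = r)
    return mu2, mu3, mu4

m = 2.5
mu2, mu3, mu4 = poisson_central_moments(m)
beta1, beta2 = mu3**2 / mu2**3, mu4 / mu2**2
```

For a moderate m such as this, the truncation at 200 terms leaves an error far below the rounding error of the arithmetic; a much larger m would call for more terms.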

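6.11. Recurrence relation for the moments of the Poisson distribution.

Theorem 5.

$$\mu_{r+1} = rm\,\mu_{r-1} + m\,\frac{d\mu_r}{dm},$$

where m is the parameter.

Proof.

$$\mu_r = E(x - m)^{r} = \sum_{x} (x - m)^{r}\,\frac{e^{-m} m^{x}}{x!}, \text{ summation on } x = 0, 1, 2, \ldots, \infty,$$

$$= \sum_{x} \frac{1}{x!}\left\{(x - m)^{r}\, e^{-m} m^{x}\right\}.$$

$$\frac{d\mu_r}{dm} = \sum_{x} \frac{1}{x!}\left[(x - m)^{r}\, e^{-m} m^{x-1}(x - m) - r(x - m)^{r-1}\, e^{-m} m^{x}\right]$$

$$= \sum_{x} \left[(x - m)^{r+1}\,\frac{e^{-m} m^{x-1}}{x!} - r(x - m)^{r-1}\,\frac{e^{-m} m^{x}}{x!}\right].$$

Therefore

$$m\,\frac{d\mu_r}{dm} = \mu_{r+1} - rm\,\mu_{r-1}.$$

Then

$$\mu_{r+1} = rm\,\mu_{r-1} + m\,\frac{d\mu_r}{dm},$$

completing the proof.

Particular cases. For r = 1,

$$\mu_2 = m\mu_0 + m\,\frac{d\mu_1}{dm} = m, \text{ since } \mu_0 = 1,\ \mu_1 = 0.$$

For r = 2,

$$\mu_3 = 2m\mu_1 + m\,\frac{d\mu_2}{dm} = m.$$

For r = 3,

$$\mu_4 = 3m\mu_2 + m\,\frac{d\mu_3}{dm} = 3m^{2} + m.$$

Hence

$$\beta_1 = \frac{\mu_3^{2}}{\mu_2^{3}} = \frac{1}{m}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^{2}} = 3 + \frac{1}{m},$$

$$\gamma_1 = \sqrt{\beta_1} = \frac{1}{\sqrt{m}}, \qquad \gamma_2 = \beta_2 - 3 = \frac{1}{m}.$$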





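Exercises

Ex. 1. Show that for the Poisson distribution (i) $\sqrt{\beta_1}\,(\beta_2 - 3)\, m\sigma = 1$, (ii) $m\sigma\,\gamma_1\gamma_2 = 1$.

Now $\sqrt{\beta_1}\,(\beta_2 - 3)\, m\sigma = \sqrt{\dfrac{1}{m}}\left(3 + \dfrac{1}{m} - 3\right) m\sqrt{m} = 1$.

Also $\gamma_1 = +\sqrt{\beta_1} = \sqrt{1/m}$, $\gamma_2 = \beta_2 - 3 = 3 + 1/m - 3 = 1/m$, so that
$m\sigma\,\gamma_1\gamma_2 = m\sqrt{m}\cdot\sqrt{1/m}\cdot 1/m = 1$.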

Ex. 2. Obtain the factorial moment of the rth order for the Poisson distribution and deduce the first four moments.

[Hint. $x^{(r)} = x(x-1)\cdots(x-r+1)$, $x! = x^{(r)}\cdot(x-r)!$

$$\mu'_{(r)} = E(x^{(r)}) = \sum_{x=0}^{\infty} x^{(r)}\,\frac{e^{-m} m^{x}}{x!} = e^{-m} m^{r} \sum_{x=r}^{\infty} \frac{m^{x-r}}{(x-r)!} = (e^{-m} m^{r})(e^{m}) = m^{r}.$$

Now $\mu'_{(1)} = \mu'_1 = m$,

$$\mu'_2 = E(x^{2}) = E[x^{(2)} + x] = E(x^{(2)}) + E(x) = m^{2} + m,$$

$$\mu'_3 = E(x^{3}) = E[x^{(3)} + 3x^{(2)} + x] = m^{3} + 3m^{2} + m,$$

$$\mu'_4 = E(x^{4}) = E[x^{(4)} + 6x^{(3)} + 7x^{(2)} + x] = m^{4} + 6m^{3} + 7m^{2} + m.]$$

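Ex. 3. Show that in a Poisson distribution with unit mean, the mean deviation about the mean is (2/e) times the standard deviation.

[Agra M.Sc. 72]

[Hint: Here m = 1 and $P(X = x) = e^{-1}/x!$

$$E(|x - 1|) = \sum_{x=0}^{\infty} |x - 1|\,\frac{e^{-1}}{x!} = e^{-1} + e^{-1} \sum_{x=2}^{\infty} \frac{x - 1}{x!}.$$

Let j = x - 1; then

$$E(|x - 1|) = e^{-1} + e^{-1} \sum_{j=1}^{\infty} \frac{j}{(j+1)!}, \text{ summation on } j = 1, 2, \ldots, \infty,$$

$$= e^{-1}\left[1 + \sum_{j=1}^{\infty}\left(\frac{1}{j!} - \frac{1}{(j+1)!}\right)\right] = e^{-1}\,[1 + 1] = \frac{2}{e}.$$

Hence the result follows because, if the mean is unity, so must be the variance, and hence the s.d. equals unity.]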

6.12. Mode of the Poisson distribution. We have

$$P(r) = P(X=r) = \frac{e^{-m} m^r}{r!}, \quad r = 0, 1, 2, \dots$$

Mode is that value of the random variate r for which P(X=r)
is greatest.

We have $P(r+1) = \dfrac{e^{-m} m^{r+1}}{(r+1)!} = \dfrac{m}{r+1}\cdot\dfrac{e^{-m} m^r}{r!}$

or $P(r+1) = \dfrac{m}{r+1}\,P(r)$ ...(1)

If r is the most probable value then
(i) $P(r) \ge P(r+1)$ and (ii) $P(r) \ge P(r-1)$.

Using (1) for (i) and (ii) we get

$r + 1 \ge m$ or $r \ge m - 1$, and $r \le m$.

Hence combining the results for r we obtain

$$m - 1 \le r \le m \qquad ...(2)$$

If m happens to be an integer, there shall be two modes, m
and m−1; otherwise the integer r shall satisfy (2).
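The rule $m - 1 \le r \le m$ translates directly into a small routine. This is an illustrative sketch only; `poisson_modes` and `brute_force_modes` are hypothetical helper names, and the brute-force search over the first 100 values is an assumption that suits moderate m.

```python
import math

def poisson_modes(m):
    """Integers r with m - 1 <= r <= m, i.e. the mode(s) of a Poisson(m) variate, m > 0."""
    if m == math.floor(m) and m >= 1:
        return [int(m) - 1, int(m)]   # integer mean: two modes, m - 1 and m
    return [math.floor(m)]            # otherwise the single integer in [m - 1, m]

def brute_force_modes(m, upto=100):
    """Cross-check: every r below `upto` whose probability is (numerically) the largest."""
    probs = [math.exp(-m) * m**r / math.factorial(r) for r in range(upto)]
    top = max(probs)
    return [r for r, p in enumerate(probs) if math.isclose(p, top, rel_tol=1e-12)]

for m in (0.5, 1.2, 2, 3):
    print(m, poisson_modes(m), brute_force_modes(m))
```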

Ex. 4. A Poisson distribution has a double mode at r=1
and r=2; what is the probability that r will have one or the other
of these two values?

[Hint. When $m - 1 = r = 1$, $m = 2$,
hence $P(r=1) = e^{-2}(2/1!) = 2/e^2$.

When $m = r = 2$, we get $P(r=2) = e^{-2}\,\dfrac{(2)^2}{2!} = 2/e^2$,
hence $P[r=1 \text{ or } r=2] = P(r=1) + P(r=2) = 4/e^2$]

6.13. Additive property of Poisson variates. If two independent
variates, X and Y, have Poissonian distributions with means
$m_1$ and $m_2$, then X + Y is a Poissonian variate with mean $m_1 + m_2$.

Proof. $P(X=s) = \dfrac{e^{-m_1} m_1^{s}}{s!}$ and $P(Y=r-s) = \dfrac{e^{-m_2} m_2^{\,r-s}}{(r-s)!}$.

Hence $$P(X+Y=r) = \sum_{s=0}^{r}\left[\frac{e^{-m_1} m_1^{s}}{s!}\cdot\frac{e^{-m_2} m_2^{\,r-s}}{(r-s)!}\right],$$

where r = 0, 1, 2, ...,

since X and Y are independent variates.

Clearly the above expression in brackets is the probability
that simultaneously X will have the value s and Y the value r−s.

Thus $$P(X+Y=r) = e^{-(m_1+m_2)}\sum \frac{m_1^{s}\, m_2^{\,r-s}}{s!\,(r-s)!}, \quad \text{summation on } s = 0, 1, 2, \dots, r$$

$$= e^{-(m_1+m_2)}\,\frac{(m_1+m_2)^r}{r!}.$$

Consequently X + Y is a Poissonian variate with mean $m_1 + m_2$.
Next, if the independent variates $X_i$ have Poissonian
distributions with means $m_i$ $(i = 1, \dots, n)$, their sum is a Poissonian
variate with mean $\sum m_i$.

Note: $P(X+Y=r) = P(X=0, Y=r) + P(X=1, Y=r-1) + \dots + P(X=s, Y=r-s) + \dots + P(X=r, Y=0)$

$= \sum P(X=s, Y=r-s) = \sum P(X=s)\,P(Y=r-s)$, where s runs from
0 to r.
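The convolution sum in the proof can be evaluated directly and compared with the Poisson probability for mean $m_1 + m_2$. The sketch below is illustrative; the function names and the sample values of $m_1$, $m_2$ are assumptions made for the check.

```python
import math

def poisson_pmf(x, m):
    """P(X = x) for a Poisson variate with mean m."""
    return math.exp(-m) * m**x / math.factorial(x)

def sum_pmf_by_convolution(r, m1, m2):
    """P(X + Y = r) as the sum over s = 0..r of P(X = s) P(Y = r - s)."""
    return sum(poisson_pmf(s, m1) * poisson_pmf(r - s, m2) for s in range(r + 1))

m1, m2 = 1.5, 2.5
for r in range(6):
    # The two columns should agree up to rounding error
    print(r, sum_pmf_by_convolution(r, m1, m2), poisson_pmf(r, m1 + m2))
```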

6.14. Moment generating function and cumulative function
of the Poisson distribution.

Keep in mind Ex. 3 (p. 213), $M_X(t) = \exp\{m(e^t - 1)\}$.

Then $C_X(t) = \log M_X(t) = m(e^t - 1) = m\left(t + t^2/2! + t^3/3! + \dots\right)$,
which provides: $\kappa_r$ = coeff. of $t^r/r!$ in $C_X(t) = m$, for all r.

This shows that all the cumulants of the Poisson distribution
are equal to m.


6.15. Normal approximation to the Poisson distribution.

Theorem 6. The standardized variable $Y = \dfrac{X - m}{\sqrt m}$ has a distribution
that approaches the normal distribution as $m \to \infty$.

Proof. Now $M_{X-m}(t) = E\{\exp((X-m)t)\} = \exp(-mt)\,E(\exp Xt) = \exp(-mt)\,\exp\{m(e^t - 1)\}$.

Replacing t by $t/\sqrt m$ leads to

$$M_Y(t) = M_{(X-m)/\sqrt m}(t) = \exp(-mt/\sqrt m)\,\exp\{m(e^{t/\sqrt m} - 1)\}$$

$$= \exp(-t\sqrt m)\,\exp\{m(e^{t/\sqrt m} - 1)\} = \exp\{m e^{t/\sqrt m} - m - t\sqrt m\}$$

$$= \exp\left(m + t\sqrt m + \frac{t^2}{2} + \frac{t^3}{3!\,\sqrt m} + \dots - m - t\sqrt m\right)$$

$$= \exp\left\{\frac{t^2}{2} + \frac{t^3}{3!\,\sqrt m} + \dots\right\}.$$

Letting $m \to \infty$, $M_Y(t) \to e^{t^2/2}$, completing the proof.
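The limit in the proof can be observed numerically by evaluating $M_Y(t)$ for growing m. This is a rough sketch under the closed form derived above; the helper name and the chosen values of t and m are illustrative assumptions.

```python
import math

def mgf_standardized_poisson(t, m):
    """M_Y(t) = exp{m(e^{t/sqrt(m)} - 1) - t sqrt(m)} for Y = (X - m)/sqrt(m)."""
    root = math.sqrt(m)
    # expm1 computes e^u - 1 accurately when u = t/sqrt(m) is small
    return math.exp(m * math.expm1(t / root) - t * root)

t = 1.0
target = math.exp(t * t / 2)          # m.g.f. of the standard normal at t
for m in (10, 1000, 100000):
    print(m, mgf_standardized_poisson(t, m), target)
```

The gap shrinks roughly like $1/\sqrt m$, which matches the leading neglected term $t^3/(3!\sqrt m)$ in the exponent.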

6.16. Fitting the Poisson distribution.

To fit a Poisson distribution to given data we usually make
use of the result that m is the mean of the distribution.

The mean of the data is taken as the value of m.

For the theoretical frequencies, we use the recursion formula,

viz. $$P(x+1) = \frac{e^{-m} m^{x+1}}{(x+1)!} = \frac{m}{x+1}\cdot\frac{e^{-m} m^{x}}{x!} = \frac{m}{x+1}\,P(x), \quad x = 0, 1, 2, \dots$$

or $$f(x+1) = \frac{m}{x+1}\,f(x), \quad x = 0, 1, 2, \dots$$




Ex. 1. In 1000 extensive sets of trials for an event of small
probability the frequencies f of the number x of successes are found
to be

x : 0 1 2 3 4 5 6 7

f : 305 365 210 80 28 9 2 1

Assuming it to be a Poisson distribution, fit a Poisson curve
to the above data. Compare the variance of the actual and observed
distribution. [Delhi (Hons) 64]

[Hint. Here $m = \dfrac{\sum fx}{N} = \dfrac{1201}{1000} = 1.2$ approx.

$\sigma^2 = \text{Var} = \dfrac{\sum fx^2}{N} - m^2 = \dfrac{2719}{1000} - 1.44 = 1.279$

$e^{-m} = e^{-1.2} = 0.3012$

The expected frequencies for x = 0, 1, 2, 3, 4, 5, 6, 7 are respectively

$f(0) = N e^{-m} = 1000\,(0.3012) = 301.2$

$f(1) = (1.2)\,f(0) = 361.4$ approx.

$f(2) = \left(\frac{1.2}{2}\right) f(1) = 216.8$, $f(3) = \left(\frac{1.2}{3}\right) f(2) = 86.7$

$f(4) = \left(\frac{1.2}{4}\right) f(3) = 26.0$, $f(5) = \left(\frac{1.2}{5}\right) f(4) = 6.2$

$f(6) = \left(\frac{1.2}{6}\right) f(5) = 1.2$, $f(7) = \left(\frac{1.2}{7}\right) f(6) = 0.20$

The variance of the actual distribution = m = 1.201
and the variance of the observed distribution = 1.279]
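The fitting procedure (sample mean as m, then the recursion for successive frequencies) can be sketched as follows. The function name `fit_poisson` is an illustrative assumption; note also that the code uses the unrounded mean 1.201, whereas the hint rounds m to 1.2, so the printed figures differ slightly from those in the hint.

```python
import math

def fit_poisson(xs, fs):
    """Fit a Poisson distribution to a frequency table.

    Returns (m, observed_variance, expected_frequencies), where the expected
    frequencies follow f(0) = N e^{-m} and f(x+1) = m/(x+1) * f(x).
    """
    n = sum(fs)
    m = sum(x * f for x, f in zip(xs, fs)) / n
    variance = sum(f * x * x for x, f in zip(xs, fs)) / n - m * m
    expected = [n * math.exp(-m)]
    for x in xs[:-1]:
        expected.append(expected[-1] * m / (x + 1))
    return m, variance, expected

xs = list(range(8))
fs = [305, 365, 210, 80, 28, 9, 2, 1]
m, variance, expected = fit_poisson(xs, fs)
print("mean:", m, "observed variance:", variance)
for x, f, e in zip(xs, fs, expected):
    print(x, f, round(e, 1))
```

For a Poisson law the variance equals the mean, so comparing `m` with the observed variance gives a quick first impression of how well the model suits the data.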

Ex. 2. If X and Y are independent Poisson variables, then
the conditional distribution of X, given X+Y, is binomial.

[Bombay M.Sc. 70, Meerut B.Sc. (Hons) 68, Gujrat 70,
Jiwaji 72, Kerala 70]

[Hint. $$P[X=r \mid X+Y=n] = \frac{P(X=r,\ X+Y=n)}{P(X+Y=n)} = \frac{P(X=r)\,P(Y=n-r)}{P(X+Y=n)},$$ since X and Y are independent,

$$= \frac{e^{-m_1} m_1^{r}}{r!}\cdot\frac{e^{-m_2} m_2^{\,n-r}}{(n-r)!}\Big/\frac{e^{-(m_1+m_2)}(m_1+m_2)^n}{n!},$$

where we have assumed that X and Y are independent
Poisson variables with parameters $m_1$ and $m_2$ respectively,

$$= \frac{n!}{r!\,(n-r)!}\cdot\frac{m_1^{r}\, m_2^{\,n-r}}{(m_1+m_2)^n} = \binom{n}{r}\left(\frac{m_1}{m_1+m_2}\right)^{r}\left(\frac{m_2}{m_1+m_2}\right)^{n-r} = \binom{n}{r} p^r (1-p)^{n-r}, \quad p = \frac{m_1}{m_1+m_2},$$

which is the probability function of a binomial distribution with
parameters n and p.]
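A direct numerical comparison illustrates the result: the conditional probabilities computed from the three Poisson terms should coincide with binomial probabilities with $p = m_1/(m_1+m_2)$. The sketch is illustrative; the helper names and the sample parameters are assumptions.

```python
import math

def poisson_pmf(x, m):
    return math.exp(-m) * m**x / math.factorial(x)

def conditional_pmf(r, n, m1, m2):
    """P(X = r | X + Y = n) for independent Poisson X (mean m1) and Y (mean m2)."""
    return poisson_pmf(r, m1) * poisson_pmf(n - r, m2) / poisson_pmf(n, m1 + m2)

def binomial_pmf(r, n, p):
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

m1, m2, n = 2.0, 3.0, 6
p = m1 / (m1 + m2)
for r in range(n + 1):
    print(r, conditional_pmf(r, n, m1, m2), binomial_pmf(r, n, p))
```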


Exercises 

1. Prove that the sum of two independent Poisson variates
is a Poisson variate. Is it true of the difference?

[Hint. Let $X_1$, $X_2$ be independent random variables. Suppose
that $X_1$, $X_2$ have Poisson distributions with parameters $m_1$, $m_2$.

$$M_{X_1}(t) = \sum_{x=0}^{\infty} e^{tx}\,\frac{e^{-m_1} m_1^{x}}{x!} = e^{m_1(e^t - 1)},$$

$M_{X_2}(t) = e^{m_2(e^t - 1)}$ and $M_{X_1+X_2}(t) = M_{X_1}(t)\,M_{X_2}(t)$

$= e^{(m_1+m_2)(e^t - 1)}$ is the m.g.f. of a random variable
with Poisson distribution having parameter $m_1 + m_2$.

$$M_{X_1-X_2}(t) = E\left(e^{t(X_1 - X_2)}\right) = e^{-m_1 - m_2 + m_1 e^{t} + m_2 e^{-t}}.$$

This is not the m.g.f. of a Poisson variate. That the difference
cannot be a Poisson variate is evident from the fact that it may take
negative as well as positive values].

2. Find the probability that at most 5 defective fuses will
be found in a box of 200 fuses if experience shows that 2% of
such fuses are defective. [$e^{-4} = 0.0183$] [Agra M.Sc. 67]

3. A car-hire firm has two cars, which it hires out day by
day. The number of demands for a car on each day is distributed
as a Poisson distribution with mean 1.5. Calculate the proportion
of days on which neither car is used and the proportion of days
on which some demand is refused. [$e^{-1.5} = 0.2231$]

[Ans. 0.2231, 0.1913]

4. In a certain factory turning out razor blades, there is a
small chance (1/500) for any blade to be defective. The blades are
supplied in packets of 10. Use the Poisson distribution to calculate
the approximate number of packets containing no defective, one
defective and two defective blades respectively in a consignment
of 10,000 packets. ($e^{-0.02} = 0.9802$) [Agra 56]

[Ans. m = 0.02; 9802, 196 and 2 packets respectively]

5. Prove the Poisson recursion formula :

$P(x+1, m) = [m/(x+1)]\,P(x, m)$ where $P(0, m) = e^{-m}$, and use
it in the example :

In 1000 extensive sets of trials for an event of small probability,
the frequency f of the number x of successes proved to be

x : 0 1 2 3 4 5 6 7

f : 305 365 210 80 28 9 2 1

Assuming it to be a Poisson distribution, fit a Poisson curve
to the above data and test the goodness of fit. Compare the variance
of the actual and observed distribution. [Delhi (Hons) 64]

6. In 1000 consecutive issues of the 'Utopian Seven Daily
Chronicle' the deaths of centenarians were recorded, the number
x having frequency f according to the table :

x : 0 1 2 3 4 5 6 7 8

f : 229 325 257 119 50 17 2 1 0

Show that the distribution is roughly Poissonian by calculating
its mean, and then the frequencies in the Poissonian distribution
with the same mean and the same total frequency of 1000. Find
also the variance of the given distribution. ($e^{-1.5} = 0.2231$)

[Agra M.Sc. 51]

7. Six coins are tossed If00 times. Using the Poisson distribution,
what is the approximate probability of getting five heads
x times? [Bombay B.Sc. 68]




„-50 


Ans. 


IF 


8. A manufacturer of cotter pins knows that 5% of his
product is defective. If he sells cotter pins in boxes of 100 and
guarantees that not more than 10 pins will be defective, what is
the approximate probability that a box will fail to meet the
guaranteed quality? [Mysore B.Sc. 64]

[Hint. $P(X > 10) = 1 - P(X \le 10) = 1 - \sum_{x=0}^{10} e^{-5}\,5^x/x!$]

9. If X is a Poissonian variate with mean m, what would
be the expectation of $e^{-kx}\,kx$ where k is a constant?

[I.A.S. 48]

[Hint. $$E(e^{-kx}\,kx) = \sum_{x=0}^{\infty} e^{-kx}\,kx\,\frac{e^{-m} m^x}{x!} = k e^{-m}\sum_{x=1}^{\infty}\frac{(m e^{-k})^x}{(x-1)!}$$

$$= k e^{-m}\, m e^{-k}\sum_{x=1}^{\infty}\frac{(m e^{-k})^{x-1}}{(x-1)!} = mk\, e^{-m-k}\, e^{m e^{-k}} = mk\, e^{m(e^{-k} - 1) - k}.]$$


10. If x and y are independent Poisson variates with means
m and m' respectively, prove that the probability that x−y has
the value r is the coefficient of $t^r$ in

$\exp\{mt + m't^{-1} - m - m'\}$ [Delhi Hons 60]

11. If m is the parameter of a Poisson variate, show that the
probabilities that the value of the variate taken at random is even
or odd are respectively

$e^{-m}\cosh m$, $e^{-m}\sinh m$ [Punjab M.Sc. 53]

12. Criticize the following statement :

The mean of a Poisson distribution is 5, while the standard
deviation is 4.

13. If x is a Poisson variable with mean m, show that
$(x-m)/\sqrt m$ is a variable with mean zero and variance unity. Find
the m.g.f. for this variable and show that it approaches
$e^{t^2/2}$ as $m \to \infty$. [Bombay B.Sc. 67]

14. Obtain the Poisson distribution as a limiting case of the binomial
distribution, stating the conditions clearly. [Delhi Hons 67, 65]

15. How is the Poisson distribution related to the negative
binomial distribution? [Punjab M.Sc. 58]

16. (a) Give an example of the Poisson distribution explaining
the underlying stochastic model responsible for it.

[I A. S. 5*>] 

(b) “Poisson distribution is of such frequent occurrence
that it is not proper to consider it as only a limiting form of the
binomial distribution.” Comment on the statement. [I.A.S. 61]

17. Let X have a Poisson distribution with parameter m>0.
If r is a non-negative integer and if $\mu'_r = E(X^r)$, prove that

$$\mu'_{r+1} = m\left(\mu'_r + \frac{d\mu'_r}{dm}\right). \quad \text{[Bombay B.Sc. 66]}$$

18. Let $X_i$ $(i = 1, 2, \dots, n)$ be independent random variables and
let each of them have the same Poisson distribution defined by

$$P(X_i = r) = \frac{e^{-m} m^r}{r!}, \quad (r = 0, 1, 2, \dots)$$

Then the distribution of the statistic $\bar x$ is
$$P(\bar x) = \exp(-mn)\,\frac{(mn)^{n\bar x}}{(n\bar x)!}, \quad \bar x = 0, 1/n, 2/n, \dots$$




6.17. The Geometric Distribution with its E(X), V(X), $M_X(t)$.
Suppose we perform an experiment E and are concerned only
about the occurrence or non-occurrence of some event A. The
experiment E is performed repeatedly and the repetitions are independent.
We assume further that on each repetition $P(A) = p$
and P(not A), i.e. $P(\bar A) = 1 - p = q$, remain the same. Suppose that
we repeat the experiment until A occurs for the first time. (Let
us note that here we depart from the assumptions leading to the
binomial distribution. In the case of the binomial distribution the
number of repetitions was prefixed, whereas here it is a random
variable.)

Let X be the random variable defining the number of repetitions
required up to and including the first occurrence of the event
A. Thus X assumes the possible values 1, 2, .... Suppose the
rth repetition results in A for the first time and the first (r−1)
repetitions of E result in $\bar A$ (complement of A), then

$$P(X=r) = q^{r-1}p, \quad r = 1, 2, \dots$$

A random variable with this probability distribution is said to
have a geometric distribution.

Clearly $P(X=r) > 0$, and for $r = 1, 2, \dots, \infty$,

$$\sum P(X=r) = p\,(1 + q + q^2 + \dots) = \frac{p}{1-q} = 1.$$

Now $$E(X) = \sum r\,p\,q^{r-1} = p\,\frac{d}{dq}\sum q^r,$$

in view of $\sum q^r$ being convergent for $|q| < 1$,

$$= p\,\frac{d}{dq}\left(\frac{1}{1-q}\right) = \frac{p}{(1-q)^2} = \frac{1}{p}.$$

$V(X) = E(X^2) - E^2(X)$

Now $$E(X^2) = \sum r^2\,p\,q^{r-1} = \sum [r(r-1) + r]\,p\,q^{r-1} = \sum r(r-1)\,p\,q^{r-1} + \sum r\,p\,q^{r-1}$$

$$= pq\,\frac{d^2}{dq^2}\sum q^r + E(X) = pq\,\frac{d^2}{dq^2}\left(\frac{1}{1-q}\right) + \frac1p = \frac{2pq}{(1-q)^3} + \frac1p = \frac{2q}{p^2} + \frac1p.$$

Thus $$V(X) = \frac{2q}{p^2} + \frac1p - \frac{1}{p^2} = \frac{2q + p - 1}{p^2} = \frac{q}{p^2}.$$

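The moment generating function is given by

$$M_X(t) = E(e^{tX}) = \sum_{r=1}^{\infty} e^{tr} q^{r-1} p = \frac{p}{q}\sum_{r=1}^{\infty} (q e^t)^r.$$

If we restrict ourselves to those values of t for which
$0 < q e^t < 1$ [that is, $t < \log(1/q)$] then we may sum the above
series as a geometric series and obtain

$$M_X(t) = \frac{p}{q}\cdot\frac{q e^t}{1 - q e^t} = \frac{p e^t}{1 - q e^t},$$

and

$$E(X) = M'_X(0) = \frac1p, \qquad E(X^2) = M''_X(0) = \frac{1+q}{p^2},$$

so that

$$V(X) = \frac{1+q}{p^2} - \frac{1}{p^2} = \frac{q}{p^2}.$$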

The Negative Binomial (or Pascal) distribution with mean and
variance.

Definition. The negative binomial is defined by

$$P(X=x) = \binom{-k}{x}(-p)^x\, q^{-k-x}, \quad x = 0, 1, 2, \dots$$

Equivalently, $P(X=x) = q^{-k}\dbinom{k+x-1}{x}\left(\dfrac{p}{q}\right)^x$.

Mean and variance of the Pascal distribution

We consider $P(Y=r) = \dbinom{r-1}{k-1} p^k q^{r-k}$, $r = k, k+1, \dots$

Let $Z_1$ = number of repetitions required up to the first occurrence
of A.

$Z_2$ = number of repetitions required between the first occurrence
of A up to and including the second occurrence
of A.

...

$Z_k$ = number of repetitions required between the (k−1)th
occurrence up to and including the kth occurrence
of A.

Thus we see that all the $Z_i$'s are independent random variables
each having a geometric distribution. Since

$$Y = Z_1 + \dots + Z_k,$$

$$E(Y) = E(Z_1) + \dots + E(Z_k) = \frac1p + \dots + \frac1p = \frac{k}{p},$$

$$V(Y) = V(Z_1) + \dots + V(Z_k) = \frac{kq}{p^2}.$$

Ex. The probability that an experiment will succeed is 4/5. 
If the experiment is repeated until four successful outcomes have 
occurred, find the expected number of repetitions required. 

[Hint. $E(Y) = k/p = 4/(4/5) = 5$]
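The expected number of repetitions can also be obtained by summing over the Pascal probabilities $P(Y=r) = \binom{r-1}{k-1}p^k q^{r-k}$. The sketch below is illustrative; the helper names and the truncation at 400 terms are assumptions.

```python
import math

def pascal_pmf(r, k, p):
    """P(Y = r): the k-th occurrence of A happens on repetition r (r >= k)."""
    return math.comb(r - 1, k - 1) * p**k * (1 - p)**(r - k)

def pascal_mean_variance(k, p, terms=400):
    """Mean and variance of Y by truncated summation over r = k, k+1, ..."""
    mean = sum(r * pascal_pmf(r, k, p) for r in range(k, terms))
    second = sum(r * r * pascal_pmf(r, k, p) for r in range(k, terms))
    return mean, second - mean**2

k, p = 4, 4 / 5
mean, variance = pascal_mean_variance(k, p)
print(mean, k / p)                      # series value against k/p
print(variance, k * (1 - p) / p**2)     # series value against kq/p^2
```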

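Moment generating function of the negative binomial distribution.

Evidently $$P(X=x) = q^{-k}\,\frac{(k+x-1)!}{x!\,(k-1)!}\left(\frac{p}{q}\right)^x, \quad x = 0, 1, 2, \dots$$

$$M_X(t) = E(e^{tX}) = \sum e^{tx}\, q^{-k}\,\frac{(k+x-1)!}{x!\,(k-1)!}\left(\frac{p}{q}\right)^x, \quad \text{summation on } x = 0, 1, 2, \dots, \infty$$

$$= q^{-k}\sum \frac{(k+x-1)!}{x!\,(k-1)!}\left(\frac{p e^t}{q}\right)^x = q^{-k}\left(1 - \frac{p e^t}{q}\right)^{-k} = (q - p e^t)^{-k}.$$

Now $$E(X) = \left.\frac{dM}{dt}\right|_{t=0} = \left. kp e^t (q - p e^t)^{-(k+1)}\right|_{t=0} = kp$$

$$E(X^2) = \left.\frac{d^2M}{dt^2}\right|_{t=0} = \left. kp e^t (q - p e^t)^{-(k+1)}\right|_{t=0} + \left. k(k+1)\,p^2 e^{2t} (q - p e^t)^{-(k+2)}\right|_{t=0}$$

$$= kp + k(k+1)\,p^2$$

and $V(X) = \sigma^2 = kp + k(k+1)p^2 - k^2p^2 = kp + kp^2 = kpq$.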

Let us note the similarity of this m.g.f. and these numbers to
those of the positive binomial distribution.

Note. As $q > 1$, $kp < kpq$, i.e. Mean < variance for the
negative binomial distribution.

Ex. For the negative binomial $(q - p)^{-k}$, where $q - p = 1$, the
cumulative function is $K(t) = -k\log[1 - p(e^t - 1)]$. Hence deduce
that $\kappa_1 = kp$, $\kappa_2 = kpq$. Find $\kappa_3$ and $\kappa_4$.

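Poisson Distribution as a special limiting case of the negative
binomial distribution.

If we let $p \to 0$ and $k \to \infty$ in such a way that $\lim kp = m$,
then

$$\lim_{\substack{k\to\infty \\ p\to 0}} q^{-k}\binom{-k}{x}\left(-\frac{p}{q}\right)^x, \quad \text{where } q = 1 + p,$$

$$= \lim_{\substack{k\to\infty \\ p\to 0}} \frac{1}{x!}\cdot\frac{(k+x-1)!}{(k-1)!}\left(\frac{p}{q}\right)^x\left(\frac{1}{q}\right)^k$$

$$= \lim_{k\to\infty} \frac{1}{x!}\cdot\frac{(k+x-1)(k+x-2)\cdots k}{k^x}\cdot\frac{m^x}{\left(1+\frac{m}{k}\right)^{x}\left(1+\frac{m}{k}\right)^{k}}$$

$$= \frac{m^x}{x!}\lim_{k\to\infty}\left(1+\frac{m}{k}\right)^{-k} = e^{-m}\,\frac{m^x}{x!}.$$

Ex. How is the Poisson distribution related to the negative
binomial distribution?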

The hypergeometric distribution

A discrete random variable having probability distribution

$$P(X=x) = \binom{Np}{x}\binom{Nq}{n-x}\Big/\binom{N}{n}, \quad x = 0, 1, \dots, n \qquad ...(1)$$

is said to have a hypergeometric distribution.

Alternatively, there is an urn containing N balls, of which Np
are white; the probability of the number of white balls contained
in a sample of size n drawn without replacement is given by (1)
above.

Let $Np = a$ and $Nq = b$ so that $N = a + b$ (since $p + q = 1$).

Hence $$P(X=x) = \binom{a}{x}\binom{b}{n-x}\Big/\binom{a+b}{n}.$$

It is easily seen that $$\sum P(X=x) = \sum\left\{\binom{a}{x}\binom{b}{n-x}\right\}\div\binom{a+b}{n} = 1.$$

The result $\sum \dbinom{a}{x}\dbinom{b}{n-x} = \dbinom{a+b}{n}$ is obviously seen on
equating the coefficient of $y^n$ in $(1+y)^a (y+1)^b = (1+y)^{a+b}$.

Properties of the hypergeometric distribution.

(a) $E(X) = np$

(b) $V(X) = npq\,\dfrac{N-n}{N-1}$

(c) $P(X=x) \approx \dbinom{n}{x} p^x (1-p)^{n-x}$ for large N.

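Proof. The mean of the hypergeometric distribution is given by

$$E(X) = \sum_x x\binom{a}{x}\binom{b}{n-x}\Big/\binom{a+b}{n} = a\sum_x \binom{a-1}{x-1}\binom{b}{n-x}\Big/\binom{a+b}{n}$$

$$= a\binom{a+b-1}{n-1}\Big/\binom{a+b}{n} = Np\cdot\frac{n}{N} = np.$$

Next $E(X^2) = E[X(X-1)] + E(X)$.

Now $$E[X(X-1)] = \sum_x x(x-1)\binom{a}{x}\binom{b}{n-x}\Big/\binom{a+b}{n} = a(a-1)\binom{a+b-2}{n-2}\Big/\binom{a+b}{n} = \frac{a(a-1)\,n(n-1)}{N(N-1)}.$$

Therefore $E(X^2) = np\,(Np-1)\,\dfrac{n-1}{N-1} + np$ and

$$V(X) = E(X^2) - [E(X)]^2 = \frac{np}{N-1}\left[(Np-1)(n-1) + N - 1 - np(N-1)\right]$$

$$= \frac{np}{N-1}\,(Nq - nq) = npq\,\frac{N-n}{N-1}.$$

For large N,
$$V(X) = \frac{N-n}{N-1}\,npq = \frac{1 - n/N}{1 - 1/N}\,npq \approx npq, \quad \text{since } 1 - n/N \approx 1 \text{ and } 1 - 1/N \approx 1.$$

npq is the variance of the binomial probability distribution.
Also $E(X) = np$, which is the mean of the binomial distribution
for every N. Hence the hypergeometric distribution approximates
to the binomial distribution for large N.

The approximation is very good if $n/N \le 0.1$.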


We can illustrate the meaning of (c) with the following simple
example. Suppose that we want to evaluate $P(X=0)$.

For n = 2, we obtain from the hypergeometric distribution

$$P(X=0) = \frac{Nq}{N}\cdot\frac{Nq-1}{N-1} = \left(1 - \frac{Np}{N}\right)\left(1 - \frac{Np}{N-1}\right).$$

From the binomial distribution we obtain $P(X=0) = q^2$.

It may be noted that $1 - \dfrac{Np}{N} = 1 - p = q$ while $1 - \dfrac{Np}{N-1}$ is
almost equal to q.

Note : Property (c) states that if the lot size N is sufficiently 
large, the distribution of X may be approximated by the binomial 
distribution. This is intuitively reasonable. 
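The three properties can be illustrated numerically by computing hypergeometric probabilities for a small and a large lot with the same p and comparing them with the binomial ones. This is only a sketch; the function names and the lot sizes (N = 50 and N = 5000 with p = 0.4, n = 5) are illustrative assumptions.

```python
import math

def hypergeometric_pmf(x, N, a, n):
    """P(X = x): x white balls in a sample of n drawn without replacement
    from N balls of which a = Np are white."""
    return math.comb(a, x) * math.comb(N - a, n - x) / math.comb(N, n)

def binomial_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def max_gap(N, p, n):
    """Largest absolute difference between the two probability functions."""
    a = round(N * p)
    return max(abs(hypergeometric_pmf(x, N, a, n) - binomial_pmf(x, n, p))
               for x in range(n + 1))

N, p, n = 50, 0.4, 5
a = round(N * p)
mean = sum(x * hypergeometric_pmf(x, N, a, n) for x in range(n + 1))
second = sum(x * x * hypergeometric_pmf(x, N, a, n) for x in range(n + 1))
print(mean, n * p)                                               # property (a)
print(second - mean**2, n * p * (1 - p) * (N - n) / (N - 1))     # property (b)
print(max_gap(50, p, n), max_gap(5000, p, n))                    # property (c)
```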


6.18. The Multinomial Distribution. Finally we consider an
important higher-dimensional variable, which may be thought of
as a generalisation of the binomial distribution.

Definition. The multinomial distribution is defined by

$$P(X_1 = n_1, X_2 = n_2, \dots, X_k = n_k) = \frac{n!}{n_1!\,n_2!\cdots n_k!}\; p_1^{n_1} p_2^{n_2}\cdots p_k^{n_k}, \qquad ...(1)$$

where $\sum_{i=1}^{k} n_i = n$.

Let us recall that the terms of the binomial distribution were
obtainable from the expansion of the binomial expression

$$(p+q)^n = \sum_{r=0}^{n}\binom{n}{r} p^r q^{n-r}.$$

In an analogous way, the above
probabilities (1) may be obtained from an expansion of the multinomial
expression

$$(p_1 + p_2 + \dots + p_k)^n.$$

The argument for deriving the expression (1) above is identical
to the one used to obtain the binomial probabilities; we must simply
observe here that the number of ways of arranging n objects, $n_1$
of which are of one kind, $n_2$ of which are of a second kind, ..., $n_k$
of which are of a kth kind is given by

$$\frac{n!}{n_1!\,n_2!\cdots n_k!}.$$


If k=2, the above reduces to the binomial distribution and 
the two possible events are “success” and “failure”. 

Suppose that $(X_1, \dots, X_k)$ has a multinomial distribution given
by (1) above. Then

$E(X_i) = np_i$ and $V(X_i) = np_i(1 - p_i)$, $i = 1, 2, \dots, k$.

This is an immediate consequence of the observation that
each $X_i$ as defined above has a binomial distribution, with probability
of success, i.e. the occurrence of $A_i$, equal to $p_i$.

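The means, variances and covariances of the multinomial distribution
can be calculated as follows.

We have $$(p_1 + \dots + p_k)^n = \sum \frac{n!}{n_1!\cdots n_k!}\; p_1^{n_1}\cdots p_k^{n_k} = 1, \quad \text{since } p_1 + p_2 + \dots + p_k = 1.$$

$$E(X_i) = \sum n_i\,P(X_1 = n_1, X_2 = n_2, \dots, X_k = n_k) = \sum n_i\,P(n_1, n_2, \dots, n_k)$$

$$= \sum n_i\,\frac{n!}{n_1!\cdots n_k!}\; p_1^{n_1}\cdots p_k^{n_k} = p_i\,\frac{\partial}{\partial p_i}(p_1 + p_2 + \dots + p_k)^n$$

$$= np_i\,(p_1 + p_2 + \dots + p_k)^{n-1} = np_i.$$

$$E(X_i X_j) = \sum n_i n_j\,P(n_1, n_2, \dots, n_k) = p_i p_j\,\frac{\partial^2}{\partial p_i\,\partial p_j}(p_1 + p_2 + \dots + p_k)^n$$

$$= n(n-1)\,p_i p_j\,(p_1 + \dots + p_k)^{n-2} = n(n-1)\,p_i p_j.$$

$$E(X_i^2) = \sum n_i^2\,P(n_1, \dots, n_k) = p_i\,\frac{\partial}{\partial p_i}\left[p_i\,\frac{\partial}{\partial p_i}(p_1 + \dots + p_k)^n\right]$$

$$= np_i\,(p_1 + \dots + p_k)^{n-1} + n(n-1)\,p_i^2\,(p_1 + \dots + p_k)^{n-2} = np_i + n(n-1)\,p_i^2.$$

Therefore

$$V(X_i) = E(X_i^2) - [E(X_i)]^2 = np_i + n(n-1)\,p_i^2 - n^2 p_i^2 = np_i(1 - p_i),$$

$$\text{Cov}(X_i, X_j) = E(X_i X_j) - E(X_i)\,E(X_j) = n(n-1)\,p_i p_j - np_i\cdot np_j = -np_i p_j.$$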
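For a small case the means, variances and covariances can be obtained by enumerating every cell $(n_1, \dots, n_k)$ and weighting by the multinomial probability. The following is a brute-force sketch only, practical for small n and k; the function names and the example n = 4, p = (0.2, 0.3, 0.5) are illustrative assumptions.

```python
import math
from itertools import product

def multinomial_pmf(counts, probs):
    """n!/(n1!...nk!) * p1^n1 ... pk^nk for the given cell counts."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)
    value = float(coef)
    for c, p in zip(counts, probs):
        value *= p**c
    return value

def multinomial_moments(n, probs):
    """Means and covariance matrix by direct enumeration of all count vectors."""
    k = len(probs)
    mean = [0.0] * k
    second = [[0.0] * k for _ in range(k)]
    for counts in product(range(n + 1), repeat=k):
        if sum(counts) != n:
            continue                      # keep only vectors with n1 + ... + nk = n
        w = multinomial_pmf(counts, probs)
        for i in range(k):
            mean[i] += counts[i] * w
            for j in range(k):
                second[i][j] += counts[i] * counts[j] * w
    cov = [[second[i][j] - mean[i] * mean[j] for j in range(k)] for i in range(k)]
    return mean, cov

n, probs = 4, (0.2, 0.3, 0.5)
mean, cov = multinomial_moments(n, probs)
print(mean)        # compare with n * p_i
print(cov[0][0])   # compare with n * p_1 * (1 - p_1)
print(cov[0][1])   # compare with -n * p_1 * p_2
```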





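Note : These results can be used for finding the variances and
covariances of linear functions of frequencies in k classes as below.

$$V(l_1 X_1 + \dots + l_k X_k) = \sum l_i^2\,V(X_i) + 2\sum\sum_{i<j} l_i l_j\,\text{Cov}(X_i, X_j)$$

$$= \sum l_i^2\,np_i(1 - p_i) + 2\sum\sum_{i<j} l_i l_j\,(-np_i p_j)$$

$$= n\left\{\sum l_i^2 p_i - \left(\sum l_i p_i\right)^2\right\}$$

$$\text{Cov}\{(l_1 X_1 + \dots + l_k X_k),\ (m_1 X_1 + \dots + m_k X_k)\} = n\left\{\sum l_i m_i p_i - \left(\sum l_i p_i\right)\left(\sum m_i p_i\right)\right\}$$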

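Moment Generating Function of the Multinomial Distribution.

$$M(t_1, \dots, t_k) = E\left\{\exp\left(\sum t_i X_i\right)\right\}$$

$$= \sum \frac{n!}{n_1!\cdots n_k!}\,\exp\left(\sum t_i n_i\right) p_1^{n_1}\cdots p_k^{n_k}, \quad \text{since } X_i = n_i,$$

$$= \sum \frac{n!}{n_1!\cdots n_k!}\,(p_1 e^{t_1})^{n_1}\cdots(p_k e^{t_k})^{n_k}$$

$$= (p_1 e^{t_1} + \dots + p_k e^{t_k})^n.$$

Ex. Show that the m.g.f. of the multinomial distribution is

$$(p_1 e^{t_1} + \dots + p_k e^{t_k})^n, \quad \sum_{i=1}^{k} p_i = 1,$$

and from this deduce that the means, variances and covariances are
given by

$E(n_i) = np_i$, $V(n_i) = np_i(1 - p_i)$,

$\text{Cov}(n_i, n_j) = -np_i p_j$, $i \ne j$.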


Exercises 


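1. The conditional distribution of independent Poisson variables
$x_1, x_2, \dots, x_k$ for the given distribution of their sum $x = x_1 + \dots + x_k$
is a multinomial distribution with index x and the probability
in each class equal to 1/k.

[Hint. $$P(x_1, \dots, x_k \mid x) = \frac{P(x_1, \dots, x_k)}{P(x)} = \prod_{i=1}^{k}\frac{e^{-m} m^{x_i}}{x_i!}\Big/\frac{e^{-mk}(mk)^x}{x!} = \frac{x!}{x_1!\cdots x_k!}\left(\frac{1}{k}\right)^x,$$ each variable having the same mean m.]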



2. Discuss the marginal and conditional distributions associated
with the multinomial distribution. If $(n_1, n_2, \dots, n_k)$ have a
multinomial distribution with parameters $n;\ p_1, p_2, \dots, p_k$ and if
$c_i$, $d_i$, $i = 1, 2, \dots, k$ are constants, find the variance of
$\sum_{i=1}^{k} c_i n_i$ and the covariance between $\sum_{i=1}^{k} c_i n_i$ and $\sum_{i=1}^{k} d_i n_i$.

[M.Sc. Bombay 69]

3. A rod of specified length is manufactured. Suppose that
the actual length X (inches) is a random variable uniformly distributed
over [10, 12]. The three events are defined as below :

$A_1 = \{X < 10.5\}$, $A_2 = \{10.5 \le X \le 11.8\}$, and $A_3 = \{X > 11.8\}$

If 10 such rods are manufactured, find the probability of
obtaining exactly 5 rods of length less than 10.5 inches and exactly
2 of length greater than 11.8 inches.

[Hint. $p_1 = P(A_1) = \displaystyle\int_{10}^{10.5}\frac{1}{12-10}\,dx = 0.25$

$p_2 = P(A_2) = \displaystyle\int_{10.5}^{11.8}\frac{1}{12-10}\,dx = 0.65$

$p_3 = P(A_3) = \displaystyle\int_{11.8}^{12}\frac{1}{12-10}\,dx = 0.1$.

Hence required probability $= \dfrac{10!}{5!\,3!\,2!}\,(0.25)^5 (0.65)^3 (0.1)^2$.]



7 

Continuous Probability 

Distributions 


7.1. Definition. When the range space $R_X$ of a random variable
X is an interval or a collection of intervals, we say that X is a
continuous random variable.

The probability density function of a continuous random
variable, denoted by pdf, is a function f satisfying the following
conditions.

(a) $f(x) \ge 0$ for all $x \in R_X$

(b) $\displaystyle\int_{R_X} f(x)\,dx = 1$.

Furthermore, we define for any $c < d$ (in $R_X$)

$$P(c < X < d) = \int_c^d f(x)\,dx.$$

The cumulative distribution function F of the random variable
X (abbreviated as cdf) is defined by the relation $F(x) = P(X \le x)$.
If X is a continuous random variable with p.d.f. f,

$$F(x) = \int_{-\infty}^{x} f(s)\,ds.$$
J -oo 

Properties of c.d.f. 


(a) The function F is non-decreasing. That is, if $x_1 \le x_2$, we
have $F(x_1) \le F(x_2)$.

For proving this we consider the events A and B as follows :
$A = \{X \le x_1\}$, $B = \{X \le x_2\}$. Then since $x_1 \le x_2$, we have $A \subset B$ and
hence $P(A) \le P(B)$, that is, $F(x_1) \le F(x_2)$.

(b) $\lim_{x\to-\infty} F(x) = 0$ and $\lim_{x\to\infty} F(x) = 1$. We often
write this as $F(-\infty) = 0$, $F(\infty) = 1$. This is easily seen, for

$$F(\infty) = \lim_{x\to\infty}\int_{-\infty}^{x} f(s)\,ds = 1.$$


(c) Let F be the c.d.f. of a continuous random variable with
p.d.f. f. Then

$$f(x) = \frac{d}{dx}F(x)$$

for all x at which F is differentiable.

Proof. $F(x) = P(X \le x) = \displaystyle\int_{-\infty}^{x} f(s)\,ds$. Then applying the
fundamental theorem of the calculus we obtain

$F'(x) = f(x)$.

Examples. 

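Ex. 1. The function defined as follows is a density function.

$f(x) = 0$, $x < 2$

$\quad = (1/18)(3 + 2x)$, $2 \le x \le 4$

$\quad = 0$, $x > 4$

For $f(x) \ge 0$ for every x in the given interval and

$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{2} 0\,dx + \int_{2}^{4}\frac{1}{18}(3 + 2x)\,dx + \int_{4}^{\infty} 0\,dx = 1.$$

Also $$P(2 < x < 3) = \int_{2}^{3}\frac{1}{18}(3 + 2x)\,dx = \frac{4}{9}.$$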
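The two integrals in Ex. 1 can be checked with a simple numerical quadrature. The sketch below uses the composite Simpson rule, which is an illustrative choice and not something the text prescribes; the function names are likewise assumptions.

```python
def f(x):
    """Density of Ex. 1: (3 + 2x)/18 on [2, 4], zero elsewhere."""
    return (3 + 2 * x) / 18 if 2 <= x <= 4 else 0.0

def simpson(g, a, b, n=100):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

print(simpson(f, 2, 4))   # total probability, compare with 1
print(simpson(f, 2, 3))   # P(2 < x < 3), compare with 4/9
```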

Ex. 2. Through a point B(0, a) a straight line is drawn in a
direction taken at random in the interval $\theta = -\pi/4$ to $\theta = \pi/4$, $\theta$
being the inclination of the line to BO. Find the probability distribution
of the intercept x on the x-axis.

Evidently $\theta = \tan^{-1}\dfrac{x}{a}$ so that $d\theta = \dfrac{a\,dx}{a^2 + x^2}$.

The probability that $\theta$ will fall in the interval $d\theta$ is

$$\frac{d\theta}{\pi/2} = \frac{2\,d\theta}{\pi} = \frac{2a}{\pi}\cdot\frac{dx}{a^2 + x^2} = f(x)\,dx, \quad -a \le x \le a.$$

Evidently $$\int_{-a}^{a}\frac{2a}{\pi}\cdot\frac{dx}{a^2 + x^2} = 1.$$

Hence the probability density function for the distribution
of the variable x is

$$f(x) = \frac{2a}{\pi(a^2 + x^2)}, \quad -a \le x \le a.$$

For, when $\theta$ falls in the interval $d\theta$, x falls in the corresponding
interval dx.

This distribution is called Cauchy's distribution.


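Ex. 3. For a continuous frequency distribution, the mean
deviation from the median is less than that measured from any
other value.

The mean deviation about an arbitrary point a is given by

$$F(a) = \int_{-\infty}^{\infty} |x - a|\,f(x)\,dx \quad \text{where} \quad \int_{-\infty}^{\infty} f(x)\,dx = 1 \qquad ...(1)$$

or $$F(a) = \int_{-\infty}^{a} (a - x)\,f(x)\,dx + \int_{a}^{\infty} (x - a)\,f(x)\,dx.$$

Differentiating with respect to a under the sign of integration
gives

$$F'(a) = \int_{-\infty}^{a} f(x)\,dx - \int_{a}^{\infty} f(x)\,dx,$$

$$F''(a) = f(a) + f(a) = 2f(a) > 0$$

if f(x) is assumed not to have zero value at the median.

[Recall the result $$\frac{d}{da}\int_{\alpha}^{\beta} f(x, a)\,dx = \int_{\alpha}^{\beta}\frac{\partial f}{\partial a}\,dx + f(\beta, a)\,\frac{d\beta}{da} - f(\alpha, a)\,\frac{d\alpha}{da},$$
where f(x, a) is a continuous function possessing
continuous partial derivatives $\dfrac{\partial^2 f}{\partial x\,\partial a}$, $\dfrac{\partial^2 f}{\partial a\,\partial x}$ and $\alpha$ and $\beta$ are differentiable
functions of a].

Now F(a) is minimum when $F'(a) = 0$,

i.e. $$\int_{-\infty}^{a} f(x)\,dx = \int_{a}^{\infty} f(x)\,dx,$$

which combined with $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = 1$

or $\displaystyle\int_{-\infty}^{a} f(x)\,dx + \int_{a}^{\infty} f(x)\,dx = 1$

yields $\displaystyle\int_{-\infty}^{a} f(x)\,dx = \frac12 = \int_{a}^{\infty} f(x)\,dx$.

This leads to the conclusion that a is the median value.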


Exercises 


1. Verify that the following is a distribution function :

$F(x) = 0$, $x < -a$

$\quad = \dfrac12\left(\dfrac{x}{a} + 1\right)$, $-a \le x \le a$

$\quad = 1$, $x > a$ [P.U. M.A. 63]

2. Explain the terms (i) probability differential, (ii) probability
density function, (iii) distribution function. [Delhi Hons 57]

[Probability differential is $dF(x) = f(x)\,dx$. By distribution
function or cumulative distribution function we mean F where
$F(x) = P(X \le x)$].

3. A bombing plane carrying 3 bombs flies directly above a
rail road track. If a bomb falls within 40 ft. of the track, the
track will be sufficiently damaged to disrupt the traffic. With a
certain bomb sight the points of impact of a bomb have the pdf :

$f(x) = (100 + x)/10{,}000$ when $-100 \le x \le 0$

$\quad = (100 - x)/10{,}000$ when $0 \le x \le 100$

$\quad = 0$ elsewhere,

where x represents the vertical direction (in feet) from the
aiming point, which is the track in this case. Find the distribution
function. If all 3 bombs are used, find the chance that the
track will be damaged. [Delhi (Hons) 57]

[$P(|x| < 40) = P(-40 < x < 40)$

$$= \int_{-40}^{0}\frac{100 + x}{10{,}000}\,dx + \int_{0}^{40}\frac{100 - x}{10{,}000}\,dx = \frac{16}{25}.$$

Probability that the track will be damaged

$$= 1 - \left(1 - \frac{16}{25}\right)^3.]$$

5. The length of life X (in hours) of a certain type of light
bulb may be supposed to be a continuous random variable with
density function f(x), given by

$f(x) = a/x^3$, $1500 \le x \le 2500$

$\quad = 0$ elsewhere.

Determine the constant a and the distribution of X. Compute
the probabilities :

$P(1800 < x < 2000)$; $P(700 < x < 1900 \mid 1500 < x < 2000)$

[B.Sc. (Bom.) 67]

6. If $\mu'_r$ exists, then $\mu'_s$ exists for all $1 \le s \le r$, i.e. if $E(x^r)$
exists, then $E(x^s)$ exists for all $1 \le s \le r$.

[M.Sc. Nagpur 68]

[$$E(|x|^s) = \int_{-\infty}^{\infty} |x|^s\,dF(x) = \int_{|x|\le 1} |x|^s\,dF(x) + \int_{|x|>1} |x|^s\,dF(x)$$

$$\le \int_{|x|\le 1} dF(x) + \int_{|x|>1} |x|^r\,dF(x),$$

since $|x|^s < |x|^r$ for $s < r$ and $|x| > 1$

$\Rightarrow E(|x|^s) \le 1 + E(|x|^r) < \infty$].


7. A continuous cumulative distribution F(x) is defined as
follows :

$F(x) = 0$, $x < 1$

$\quad = \frac{1}{16}(x - 1)^4$, $1 \le x \le 3$

$\quad = 1$, $x > 3$

Find the probability density f(x). Find the mean of x and
the median. [B.Sc. (Bom) 64]

[Ans. $f(x) = \frac14 (x - 1)^3$, $1 \le x \le 3$

$\quad = 0$, otherwise. Mean = 2.6, Median $= 1 + 8^{1/4}$]

8. A variate has the density function

$f(x) = \frac{1}{16}(3 + x)^2$, $-3 \le x \le -1$

$\quad = \frac{1}{16}(6 - 2x^2)$, $-1 \le x \le 1$

$\quad = \frac{1}{16}(3 - x)^2$, $1 \le x \le 3$.

Find the mean and the standard deviation of the distribution.

[Ans. Mean = 0, S.D. = 1]

[M.Sc. (Agra) 48]

9. (a) A random variable x takes values 0, 1, 2, ... with
probability proportional to $(x + 1)(1/5)^x$. Find the probability
that $x \le 5$. [B.A. (Hons) Delhi 66]

(b) Calculate the coefficient of variation for the rectangular
distribution on [0, b], given that the probability law for
the rectangular distribution is $P(x \le t) = t/b$.

(Ans. $100 \times 1/\sqrt3$) [B.A. (M-d) 54]

10. Express the constants $y_0$, a and p $(0 < p < 1)$ of the
distribution

$$dF = y_0\left(1 - \frac{x^2}{a^2}\right)^p dx, \quad -a \le x \le a$$

in terms of $\mu_2$ and $\beta_2$. [B.A. (Hons) Delhi 62]

11. A random variable x is known to have the distribution

$$dF = c\left(1 + \frac{x}{a}\right)^{m-1} e^{-mx/a}\,dx, \quad -a \le x < \infty.$$

Find the constant c and the first four moments about the
mean. Determine the linear relation between $\beta_1$ and $\beta_2$ of this
distribution. Also find the m.g.f. about the origin.

[M.A. (Stat) Delhi 68]

[Ans. $c = \dfrac{m^m}{a\,e^m\,\Gamma(m)}$, $\mu_2 = \dfrac{a^2}{m}$, $\mu_3 = \dfrac{2a^3}{m^2}$, $\mu_4 = \dfrac{3a^4(m+2)}{m^3}$,

$2\beta_2 - 3\beta_1 = 6$, m.g.f. $= e^{-at}(1 - at/m)^{-m}$]

12. The probability law of a continuous random variable x
is $f(x) = y_0\,e^{-b(x-a)}$, $a \le x < \infty$,

where a, b, $y_0$ are constants. Show that $y_0 = b = 1/\sigma$ and $a = m - \sigma$,
where m and $\sigma$ are the mean and standard deviation of the distribution.
Show also that $\beta_1 = 4$, $\beta_2 = 9$.

[M.A. (Delhi) 57, B.A. (Hons) Delhi 56, 60]

13. Verify that the following are probability density
functions :

(a) $f(x) = \frac12 x^2$, $0 \le x \le 1$

$\quad = \frac12[x^2 - 3(x-1)^2]$, $1 \le x \le 2$

$\quad = \frac12[x^2 - 3(x-1)^2 + 3(x-2)^2]$, $2 \le x \le 3$

$\quad = 0$, elsewhere

(b) $f(x) = \dfrac{1}{\pi}\cdot\dfrac{1}{1 + x^2}$, $-\infty < x < \infty$

14. A probability curve $y = f(x)$ has a range from 0 to $\infty$.
If $f(x) = e^{-x}$, find the mean and variance and third moment about
mean. [Patna 61]

[Ans. mean = 1, $\mu'_2 = 2$, $\mu'_3 = 6$, $\mu_2 = 1$, $\mu_3 = 2$]

15. For the rectangular distribution :

$dF = dx$, $0 \le x \le 1$,

$\mu'_1 = \frac12$, variance = 1/12 and mean deviation about
mean = 1/4.

[Mysore 71, I.A.S. 57]

16. Let $f(x) = y_0\,x(2 - x)$, $0 \le x \le 2$.

Then $\mu'_1 = 1$, $\mu'_2 = 6/5$, $\mu'_3 = 8/5$, $\mu'_4 = 16/7$,
$\mu_2 = 1/5$, $\mu_3 = 0$, $\mu_4 = 3/35$, $\beta_1 = 0$, $\beta_2 = 15/7$,
$\mu_{2n+1} = 0$, mean deviation about mean = 3/8,
mode = 1, median = 1 and H = 2/3.

[Delhi B.Sc. 71]

17. Prove that the geometric mean G of the distribution

$dF = 6(2 - x)(x - 1)\,dx$, $1 \le x \le 2$
is given by $6\log(16G) = 19$. [Delhi B.Sc. 71]


7.2. Functions of Random Variables. Suppose that $y = H(x)$
is a real-valued function of x. Then $Y = H(X)$ is a random variable
whenever X is a random variable. Let S be a sample space
associated with an experiment and let X be a random variable
defined on S. Then

$x = X(s)$ for every $s \in S$

and $y = H[X(s)]$.

The range space of X, i.e. the set of all possible values of the
function X, we shall denote by $R_X$ and the range space of the
random variable Y, i.e. the set of all possible values of Y, we shall
denote by $R_Y$.

Let C be an event (subset) associated with $R_Y$, as described
above. Let the event $B \subset R_X$ be defined as

$$B = \{x \in R_X : H(x) \in C\}.$$

The events B and C related in this way are called equivalent
events. In words : B is the set of all values of X such that
$H(x) \in C$.

For any event $C \subset R_Y$ we define the probability of C as
follows :

$$P(C) = P[\{x \in R_X : H(x) \in C\}] \qquad ...(1)$$

In words : The probability of an event associated with the
range space of Y is defined as the probability of the equivalent
event (in terms of x) as given by Eq. (1). Eq. (1) can be written
alternatively as

$$P(C) = P[\{x \in R_X : H(x) \in C\}] = P[\{s \in S : H(X(s)) \in C\}] \qquad ...(2)$$

Ex. 1. Let X be a continuous random variable with pdf

$f(x) = e^{-x}$, $x > 0$.

Find $P(Y \ge 5)$ where $Y = H(X) = 2X + 1$.

Sol. Here $R_X = \{x \mid x > 0\}$, $R_Y = \{y \mid y > 1\}$.

Now $y \ge 5$ if and only if $2x + 1 \ge 5$, which in turn
yields $x \ge 2$. Hence the event $C = \{Y \ge 5\}$
is equivalent to the event $B = \{X \ge 2\}$.

Thus $P\{Y \ge 5\} = P\{2X + 1 \ge 5\} = P\{X \ge 2\} = \displaystyle\int_{2}^{\infty} e^{-x}\,dx = e^{-2}$.



Ex. 2. If X is a random variable having a continuous probability density function and F(x) denotes the probability that X is less than or equal to x, then show that F(X) is a random variate having a rectangular distribution in the range (0, 1).

[M.A. (Delhi) 55]

We have

$$P(X \le x) = F(x) = \int_{-\infty}^{x} f(x)\,dx.$$

Put $u = F(x)$ so that $du/dx = F'(x) = f(x)$. Then

$$dF = f(x)\,dx = f(x)\,\frac{dx}{du}\,du = \frac{f(x)}{du/dx}\,du = \frac{f(x)}{f(x)}\,du = du, \quad 0 \le u \le 1,$$

since $0 \le F(x) \le 1$.

Hence the distribution is transformed into the very simple "rectangular" form in which all values of the variable from 0 to 1 are equally frequent.

Note. This example illustrates that the pdf of any continuous frequency distribution whose cdf is F can be transformed to the uniform density $f(u) = 1$, $0 \le u \le 1$, by the substitution $u = F(x)$.

The transformation $u = F(X)$ is called the probability transformation.
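The probability transformation can be illustrated by simulation. The following is a hedged sketch only: the exponential distribution with cdf $F(x) = 1 - e^{-x}$ is chosen here as an assumed example, and the helper names are hypothetical.

```python
import math
import random

def probability_transform_sample(n, seed=0):
    """Draw X from the pdf e^{-x}, x > 0, and return the values u = F(X) = 1 - e^{-X}."""
    rng = random.Random(seed)
    xs = [rng.expovariate(1.0) for _ in range(n)]
    return [1.0 - math.exp(-x) for x in xs]

def empirical_cdf(us, u):
    """Fraction of the sample not exceeding u."""
    return sum(1 for v in us if v <= u) / len(us)

us = probability_transform_sample(20000)
for u in (0.25, 0.5, 0.75):
    # For a rectangular distribution on (0, 1) the empirical cdf should be close to u.
    print(u, round(empirical_cdf(us, u), 3))
```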


Exercises 

1. Suppose that X is uniformly distributed over (1, 3). Find the pdf of the following random variables: (a) $Y = 3X + 4$ and (b) $Z = e^{X}$.

[Ans. (a) $g(y) = 1/6$, $7 < y < 13$; (b) $h(z) = 1/(2z)$, $e < z < e^{3}$]

2. Suppose that X is uniformly distributed over the interval 
(0, 1). Find the pdf of the following random variables 

(a) $Y = X^2 + 1$ and (b) $Z = 1/(X + 1)$

3. Suppose that the continuous random variable X has pdf $f(x) = e^{-x}$, $x > 0$.

Find the pdf of the following random variables: (a) $Y = X^3$

< b) z= uTT)i 

[Ans. (a) $g(y) = \tfrac{1}{3}\,y^{-2/3}\,e^{-y^{1/3}}$, $y > 0$]

4. Suppose that X is uniformly distributed over (-1, 1). Find the pdf of the following random variables:

(a) $Y = \sin(\pi X/2)$ (b) $Z = \cos(\pi X/2)$ (c) $W = |X|$

[Ans. (a) $g(y) = \frac{1}{\pi}(1 - y^2)^{-1/2}$, $-1 < y < 1$;

(b) $h(z) = \frac{2}{\pi}(1 - z^2)^{-1/2}$, $0 < z < 1$;

(c) $f(w) = 1$, $0 < w < 1$]


7.3. Two- and Higher-Dimensional Random Variables.

Definition. Assume there exists a sample space S associated with an experiment. Let $X = X(s)$ and $Y = Y(s)$ be two functions each assigning a real number to each outcome $s \in S$. We call (X, Y) a two-dimensional random variable.

If $X_1 = X_1(s), X_2 = X_2(s), \ldots, X_n = X_n(s)$ are n functions each assigning a real number to every outcome $s \in S$, we call $(X_1, \ldots, X_n)$ an n-dimensional random variable.

The range space of (X, Y), denoted by $R_{X \times Y}$, is the set of all possible values of (X, Y). In the two-dimensional case, the range space of (X, Y) will be a subset of the Euclidean plane. We shall suppress the functional nature of X and Y by writing, for example, $P[X \le a, Y \le b]$ instead of $P[X(s) \le a, Y(s) \le b]$.

(X, Y) is a two-dimensional continuous random variable if (X, Y) can assume all values in some non-countable set of the Euclidean plane. For example, (X, Y) assumes all values in the circle

$$\{(x, y) \mid x^2 + y^2 \le 1\}.$$

Let (X, Y) be a continuous random variable assuming all values in some region R of the Euclidean plane. The joint probability density function f is a function satisfying the following conditions:

(1) $f(x, y) \ge 0$ for all $(x, y) \in R$

(2) $\iint_R f(x, y)\,dx\,dy = 1$

The cumulative distribution function (cdf) F of the two-dimensional random variable (X, Y) is defined by

$$F(x, y) = P(X \le x,\ Y \le y).$$

If F is the cdf of a two-dimensional random variable with joint pdf f, then

$$\frac{\partial^2 F(x, y)}{\partial x\,\partial y} = f(x, y)$$

wherever F is differentiable.

Marginal and conditional probability distributions

Let f be the joint pdf of the continuous two-dimensional random variable (X, Y). We define g and h, the marginal probability density functions of X and Y, respectively, as follows:

$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy; \qquad h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx.$$

Also

$$P(c \le X \le d) = P[c \le X \le d,\ -\infty < Y < \infty] = \int_c^d \int_{-\infty}^{\infty} f(x, y)\,dy\,dx = \int_c^d g(x)\,dx.$$

The conditional pdf of X for given Y = y is defined by

$$g(x \mid y) = \frac{f(x, y)}{h(y)}, \quad h(y) > 0.$$

The conditional pdf of Y for given X = x is defined by

$$h(y \mid x) = \frac{f(x, y)}{g(x)}, \quad g(x) > 0.$$

The above conditional pdf's satisfy all the requirements for a one-dimensional pdf. Thus, for fixed y, we have $g(x \mid y) \ge 0$ and

$$\int_{-\infty}^{\infty} g(x \mid y)\,dx = \int_{-\infty}^{\infty} \frac{f(x, y)}{h(y)}\,dx = \frac{h(y)}{h(y)} = 1.$$

We say that X and Y are independent random variables if and only if $f(x, y) = g(x)\,h(y)$ for all (x, y), where f is the joint pdf, and g and h are the marginal pdf's of X and Y, respectively.

The two-dimensional continuous random variable is uniformly distributed over a region R in the Euclidean plane if

$$f(x, y) = \text{const. for } (x, y) \in R, \qquad = 0 \text{ elsewhere.}$$

We assume that R is a region with finite, non-zero area. Since

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1,$$

the above implies that the constant equals 1/area(R).


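Ex. 1. Show that the function

$$f(x, y) = x^2 + \frac{xy}{3}, \quad 0 \le x \le 1,\ 0 \le y \le 2; \qquad = 0 \text{ elsewhere}$$

is a joint pdf of a two-dimensional continuous random variable (X, Y), and find $P(X + Y \ge 1)$.

We have

$$\int_0^1\!\int_0^2 \left(x^2 + \frac{xy}{3}\right) dy\,dx = \int_0^1 \left(2x^2 + \frac{2x}{3}\right) dx = \frac{2}{3} + \frac{1}{3} = 1.$$

Also

$$P(X + Y \ge 1) = 1 - P(X + Y < 1) = 1 - \int_0^1\!\int_0^{1-x} \left(x^2 + \frac{xy}{3}\right) dy\,dx$$

$$= 1 - \int_0^1 \left[x^2(1 - x) + \frac{x(1 - x)^2}{6}\right] dx = 1 - \frac{7}{72} = \frac{65}{72}.$$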
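The value 65/72 can be checked numerically. This is a minimal sketch under stated assumptions: the inner integral over y is done in closed form and the outer integral over x uses a midpoint rule; the function names are illustrative only.

```python
def inner_integral(x):
    """Integral of x^2 + x*y/3 over y from max(0, 1 - x) to 2, i.e. over the part where x + y >= 1."""
    y0 = max(0.0, 1.0 - x)
    return x * x * (2.0 - y0) + x * (4.0 - y0 * y0) / 6.0

def prob_sum_at_least_one(n=2000):
    """Midpoint rule in x over [0, 1] for P(X + Y >= 1)."""
    h = 1.0 / n
    return sum(inner_integral((i + 0.5) * h) for i in range(n)) * h

print(prob_sum_at_least_one(), 65 / 72)
```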





Ex. 2. If

$$f(x, y) = 2 - x - y, \quad 0 \le x \le 1,\ 0 \le y \le 1; \qquad = 0 \text{ otherwise,}$$

find the marginal density functions, conditional density functions, var(X), var(Y) and the coefficient of correlation between X and Y.

[M.A. (Delhi) 63, M.A. (Stat) Delhi 59, M.Sc. (Gauh) 68]

(i) $g(x) = \int_0^1 (2 - x - y)\,dy = 3/2 - x$,

$h(y) = \int_0^1 (2 - x - y)\,dx = 3/2 - y$.

(ii) $g(x \mid y) = \dfrac{f(x, y)}{h(y)} = \dfrac{2 - x - y}{3/2 - y}$,

$h(y \mid x) = \dfrac{f(x, y)}{g(x)} = \dfrac{2 - x - y}{3/2 - x}$.

(iii) $\bar{x} = \int_0^1 x(3/2 - x)\,dx = 5/12$, $\bar{y} = \int_0^1 y(3/2 - y)\,dy = 5/12$.

Hence $\text{var}(X) = \int_0^1 (x - 5/12)^2 (3/2 - x)\,dx = \dfrac{11}{144}$,

$\text{var}(Y) = \int_0^1 (y - 5/12)^2 (3/2 - y)\,dy = \dfrac{11}{144}$.

(iv) $E(XY) = \int_0^1\!\int_0^1 xy(2 - x - y)\,dy\,dx = \left[\dfrac{x^2}{3} - \dfrac{x^3}{6}\right]_0^1 = \dfrac{1}{6}$,

therefore $\text{Cov}(X, Y) = E(XY) - \bar{x}\,\bar{y} = \dfrac{1}{6} - \dfrac{25}{144} = -\dfrac{1}{144}$

and $\rho = \dfrac{\text{Cov}(X, Y)}{\sigma_x \sigma_y} = (-1/144)/(11/144) = -1/11$.
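The moments in Ex. 2 can be reproduced exactly with rational arithmetic. The sketch below is an illustration, not the book's method: it integrates $x^a y^b (2 - x - y)$ over the unit square term by term, and the helper name `moment` is an assumption.

```python
from fractions import Fraction

def moment(a, b):
    """E[X^a Y^b] for f(x, y) = 2 - x - y on the unit square, integrated term by term."""
    return (Fraction(2, (a + 1) * (b + 1))
            - Fraction(1, (a + 2) * (b + 1))
            - Fraction(1, (a + 1) * (b + 2)))

mean_x = moment(1, 0)
var_x = moment(2, 0) - mean_x ** 2
cov_xy = moment(1, 1) - moment(1, 0) * moment(0, 1)
rho = cov_xy / var_x  # var(X) = var(Y) by symmetry, so sigma_x * sigma_y = var(X)

print(mean_x, var_x, cov_xy, rho)
```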

Ex. 3. If

$$f(x, y) = 24y(1 - x), \quad 0 \le y \le x \le 1; \qquad = 0 \text{ otherwise,}$$

find the marginal and conditional distribution functions. Are the variates X and Y independent?

Now $g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^x 24y(1 - x)\,dy = 12x^2(1 - x)$, $0 \le x \le 1$,

$h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_y^1 24y(1 - x)\,dx = 12y(1 - y)^2$, $0 \le y \le 1$.

$g(x \mid y) = \dfrac{f(x, y)}{h(y)} = \dfrac{24y(1 - x)}{12y(1 - y)^2} = \dfrac{2(1 - x)}{(1 - y)^2}$, $y \le x \le 1$,

$h(y \mid x) = \dfrac{f(x, y)}{g(x)} = \dfrac{24y(1 - x)}{12x^2(1 - x)} = \dfrac{2y}{x^2}$, $0 \le y \le x$.

Since $f(x, y) \ne g(x)h(y)$, X and Y are not independent variates.

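Ex. 4. If a, b, $ab - h^2$ are positive, determine the value of k for which

$$f(x, y) = k \exp[-(ax^2 + 2hxy + by^2)]$$

is a bivariate frequency density over $-\infty \le x \le \infty$; $-\infty \le y \le \infty$.

We require

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = k \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(ax^2 + 2hxy + by^2)}\,dx\,dy = 1.$$

The expression

$$ax^2 + 2hxy + by^2 = \frac{1}{a}\{a^2x^2 + 2ahxy + aby^2\} = \frac{1}{a}[(ax + hy)^2 + (ab - h^2)y^2].$$

Hence the integral becomes

$$k \int_{-\infty}^{\infty} e^{-(ab - h^2)y^2/a} \left[\int_{-\infty}^{\infty} e^{-(ax + hy)^2/a}\,dx\right] dy = k\,\sqrt{\frac{\pi}{a}}\,\sqrt{\frac{\pi a}{ab - h^2}} = \frac{k\pi}{\sqrt{ab - h^2}},$$

$$\left[\text{Recall } \int_{-\infty}^{\infty} e^{-c^2x^2}\,dx = \frac{\sqrt{\pi}}{c}\right]$$

that is $k = \dfrac{\sqrt{ab - h^2}}{\pi}$.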

Exercises 

1. The two-dimensional random variable (X, Y) has joint p.d.f.

$f(x, y) = kx(x - y)$, $0 < x < 2$, $-x < y < x$

$= 0$, elsewhere.

(a) Evaluate the constant k. (b) Find the marginal p.d.f. of X and of Y.

[Ans. $k = 1/8$; $g(x) = x^3/4$, $0 < x < 2$; $h(y) = \frac{1}{3} - \frac{y}{4} + \frac{y^3}{48}$ for $0 \le y < 2$ and $h(y) = \frac{1}{3} - \frac{y}{4} + \frac{5y^3}{48}$ for $-2 < y \le 0$]

2. The joint density function of a bivariate distribution is given as follows:

$f(x, y) = x + y$, $0 \le x \le 1$, $0 \le y \le 1$

$= 0$, otherwise.

Determine the marginal distributions and the covariance between X and Y. [Punjab M.A. 63]

[Ans. $g(x) = x + \tfrac{1}{2}$, $h(y) = y + \tfrac{1}{2}$].

3. If X and Y are two random variables having joint density function

$f(x, y) = \tfrac{1}{8}(6 - x - y)$, $0 \le x < 2$, $2 \le y < 4$

$= 0$, otherwise,

find (a) $P(X < 1,\ Y < 3)$, (b) $P(X + Y < 3)$ and (c) $P(X < 1 \mid Y < 3)$.

[B.Sc. (Bom) 69]

[Ans. (a) 3/8, (b) 5/24, (c) 3/5].

4. The joint p.d.f. of the two-dimensional random variable (X, Y) is given by

$f(x, y) = x^2 + xy/3$, $0 \le x \le 1$; $0 \le y \le 2$

$= 0$, elsewhere.

Compute the following:

(a) $P(X > \tfrac{1}{2})$; (b) $P(Y < X)$; (c) $P(Y < \tfrac{1}{2} \mid X < \tfrac{1}{2})$

[Ans. (a) 5/6 (b) 7/24 (c) 5/32]

5. The joint p.d.f. of (X, Y) is given by

$f(x, y) = e^{-y}$, for $x > 0$, $y > x$

$= 0$, elsewhere.

Find the marginal p.d.f. of X and Y, and evaluate

P(X< 2 I y==4). 

7.4. Characteristic function. We now describe the function, denoted by $\phi_X(t)$ and called the characteristic function of the random variable X, which has the property that a knowledge of $\phi_X(t)$ serves to specify the probability law of the random variable X. We define $\phi_X(t)$ as below.

$$\phi_X(t) = E(e^{itX}) = \int_{-\infty}^{\infty} e^{itx}\,dF(x)$$

Also $\phi_X(t) = E(e^{itX}) = E[\cos(tX)] + iE[\sin(tX)]$.

The characteristic function is defined for all values of the real variable t. For we have

$|e^{itx}| = 1$ since $|e^{itx}|^2 = \cos^2(tx) + \sin^2(tx) = 1$, and so

$$\left|\int_{-\infty}^{\infty} e^{itx} f(x)\,dx\right| \le \int_{-\infty}^{\infty} f(x)\,dx = 1.$$

Consequently, the characteristic function always exists.

The moments of the random variable X that exist may be obtained from a knowledge of the characteristic function by the formula

$$\mu_r' = E(X^r) = \frac{1}{i^r}\,\phi_X^{(r)}(0).$$

From a knowledge of the characteristic function of a random variable we may obtain a knowledge of its distribution function, its pdf (if it exists) and many other expectations.

Basic property of the characteristic functions.

If $\phi_X(t) = \phi_Y(t)$ for every real number t, then $P[a \le X \le b] = P[a \le Y \le b]$ and $F_X(a) = F_Y(a)$ for all real numbers a and b.

Also

$$\phi_{X_1 + X_2}(t) = \phi_{X_1}(t)\,\phi_{X_2}(t)$$

if $X_1$ and $X_2$ are independent.

Because of this property the characteristic functions represent the ideal tool for the study of the problem of addition of independent random variables.

The following properties of characteristic functions may be noted.

(i) $\phi(0) = 1$

(ii) $|\phi(t)| \le 1$

(iii) $\phi(-t) = \overline{\phi(t)}$, where $\overline{\phi(t)}$ denotes the complex number conjugate to $\phi(t)$.
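Ex. 1. If X is a normalized normal variable, it has the probability density

$$f(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)$$

and its characteristic function is given by

$$\phi_X(t) = \exp(-\tfrac{1}{2}t^2).$$

We have

$$\phi_X(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp(itx)\,\exp(-\tfrac{1}{2}x^2)\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \sum_{n=0}^{\infty} \frac{(itx)^n}{n!}\,\exp(-\tfrac{1}{2}x^2)\,dx$$

$$= \sum_{n=0}^{\infty} \frac{(it)^n}{n!}\,\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^n \exp(-x^2/2)\,dx.$$

The interchange of the order of summation and integration is justified by the fact that the infinite series is dominated by the integrable function $\exp\{|tx| - \tfrac{1}{2}x^2\}$.

Let us note that when n is odd,

$$\int_{-\infty}^{\infty} x^n e^{-x^2/2}\,dx = 0,$$

and when n is even, say 2m, then

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^{2m} e^{-x^2/2}\,dx = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} x^{2m} e^{-x^2/2}\,dx = \frac{(2m)!}{2^m\,m!}.$$

Thus

$$\phi_X(t) = \sum_{m=0}^{\infty} \frac{(it)^{2m}}{(2m)!}\,\frac{(2m)!}{2^m\,m!} = \sum_{m=0}^{\infty} \frac{(-\tfrac{1}{2}t^2)^m}{m!} = e^{-t^2/2}.$$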
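The result of Ex. 1 can be checked numerically. The sketch below is an assumption-laden illustration: it truncates the integral to a finite interval and applies the trapezoidal rule to $E[\cos(tX)]$ (the sine part vanishes because the density is even); the function name and the step settings are not from the text.

```python
import math

def normal_cf_numeric(t, half_width=10.0, n=20000):
    """Trapezoidal estimate of E[cos(tX)] for a normalized normal variable X."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0  # end points get half weight
        total += weight * math.cos(t * x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

for t in (0.0, 0.5, 1.0, 2.0):
    print(t, normal_cf_numeric(t), math.exp(-0.5 * t * t))
```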




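Remark. If $X = aY + b$, then

$$\phi_X(t) = E(e^{itX}) = E(e^{it(aY + b)}) = e^{itb}\,E(e^{itaY}) = e^{itb}\,\phi_Y(at).$$

In view of this result, if X has p.d.f.

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\{(x - m)/\sigma\}^2}$$

and if we define $Y = \dfrac{X - m}{\sigma}$, or $X = \sigma Y + m$, then $\phi_X(t) = \exp(imt - \tfrac{1}{2}\sigma^2 t^2)$.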
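Ex. 2. If X has Cauchy's distribution

$$f(x) = \frac{1}{\pi}\,\frac{a}{a^2 + x^2}, \quad -\infty < x < \infty,$$

then its characteristic function is $e^{-a|t|}$.

Now

$$\phi_X(t) = \frac{a}{\pi} \int_{-\infty}^{\infty} \frac{\cos(tx)}{a^2 + x^2}\,dx + i\,\frac{a}{\pi} \int_{-\infty}^{\infty} \frac{\sin(tx)}{a^2 + x^2}\,dx.$$

Since the integrand in the second integral is an odd function of x for all a and t, and the integral converges absolutely, we have

$$\int_{-\infty}^{\infty} \frac{\sin(tx)}{a^2 + x^2}\,dx = 0.$$

We recall the known formula

$$\int_0^{\infty} \frac{\cos(mx)}{1 + x^2}\,dx = \frac{\pi}{2}\,e^{-|m|}$$

and then

$$\phi_X(t) = \frac{a}{\pi} \cdot 2 \int_0^{\infty} \frac{\cos(tx)}{a^2 + x^2}\,dx = \frac{2}{\pi a} \int_0^{\infty} \frac{\cos(tx)}{1 + (x/a)^2}\,dx = \frac{2}{\pi} \cdot \frac{\pi}{2}\,e^{-a|t|} = e^{-a|t|},$$

since $a > 0$.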

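Ex. 3. If X has Laplace's distribution

$$f(x) = \frac{a}{2}\,e^{-a|x|}, \quad -\infty < x < \infty,\ a > 0,$$

its characteristic function is

$$\phi_X(t) = \frac{a^2}{a^2 + t^2}.$$

Now

$$\phi_X(t) = E(e^{itX}) = \int_{-\infty}^{\infty} e^{itx}\,\frac{a}{2}\,e^{-a|x|}\,dx = \frac{a}{2}\left\{\int_{-\infty}^{\infty} e^{-a|x|}\cos(tx)\,dx + i \int_{-\infty}^{\infty} e^{-a|x|}\sin(tx)\,dx\right\}.$$

The integrand in the second integral is an odd function and

$$\int_{-\infty}^{\infty} e^{-a|x|}\sin(tx)\,dx = 0,$$

therefore

$$\phi_X(t) = a \int_0^{\infty} e^{-ax}\cos(tx)\,dx = a\left[\frac{e^{-ax}(t \sin tx - a \cos tx)}{a^2 + t^2}\right]_0^{\infty} = \frac{a^2}{a^2 + t^2}.$$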

We state the following theorems without proof. 


Theorem 1. Suppose X is a continuous random variable with p.d.f. f(x) and characteristic function $\phi_X(t)$. Then

$$f(x) = \lim_{T \to \infty} \frac{1}{2\pi} \int_{-T}^{T} \phi_X(t)\,e^{-itx}\,dt$$

holds for every x which is a point of continuity of f(x).

Working rule: If $\int_{-\infty}^{\infty} |\phi_X(t)|\,dt < \infty$, then

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\,\phi_X(t)\,dt, \quad \text{for any real } x.$$

Theorem 2. If a continuous random variable X has the characteristic function $\phi_X(t)$, then its distribution function satisfies the relationship






ll 




Ex. 1. The distribution for which the characteristic function is $e^{-|t|}$ has the density function

$$f(x) = \frac{1}{\pi}\,\frac{1}{1 + x^2}.$$

[M.A. (Delhi) 66, 68; Agra 70]

We have

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\,e^{-|t|}\,dt = \frac{1}{\pi} \int_0^{\infty} e^{-t} \cos tx\,dt \quad \text{(integrate by parts)}$$

$$= \frac{1}{\pi} - x^2 f(x).$$

Hence $f(x) = \dfrac{1}{\pi}\,\dfrac{1}{1 + x^2}$, $-\infty < x < \infty$.

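Ex. 2. Find the density function f(x) corresponding to the characteristic function defined as follows:

$\phi(t) = 1 - |t|$, $|t| \le 1$

$= 0$, $|t| > 1$. [M.A. Delhi 53]

We have

$$f(x) = \frac{1}{2\pi} \int_{-1}^{1} (1 - |t|)\,e^{-itx}\,dt = \frac{1}{2\pi}\left[\int_{-1}^{0} (1 + t)\,e^{-itx}\,dt + \int_0^1 (1 - t)\,e^{-itx}\,dt\right]$$

$$= \frac{1}{\pi} \int_0^1 (1 - t)\cos(tx)\,dt = \frac{1}{\pi}\,\frac{1 - \cos x}{x^2}, \quad -\infty < x < \infty$$

(integrating by parts).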

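Ex. 3. If $x_1, x_2, x_3, x_4$ are chance variables whose distribution has density function given by

$$\frac{1}{(2\pi)^2} \exp\left[-\tfrac{1}{2}(x_1^2 + x_2^2 + x_3^2 + x_4^2)\right],$$

show that $z = x_1x_2 - x_3x_4$ has the density function given by

$$\tfrac{1}{2}\,e^{-|z|}. \quad \text{[M.A. Bom. 56]}$$

We have

$$\phi(t) = \frac{1}{(2\pi)^2} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \exp[it(x_1x_2 - x_3x_4)]\,\exp\left(-\tfrac{1}{2}\sum_{i=1}^{4} x_i^2\right) dx_1\,dx_2\,dx_3\,dx_4$$

$$= \left[\frac{1}{2\pi} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \exp\{-\tfrac{1}{2}(x_1^2 - 2itx_1x_2 + x_2^2)\}\,dx_1\,dx_2\right] \times \left[\frac{1}{2\pi} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \exp\{-\tfrac{1}{2}(x_3^2 + 2itx_3x_4 + x_4^2)\}\,dx_3\,dx_4\right]$$

$$= \frac{1}{\sqrt{1 + t^2}} \cdot \frac{1}{\sqrt{1 + t^2}} = \frac{1}{1 + t^2}.$$

$$\left[\text{Recall } \frac{1}{\pi} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \exp\{-(ax^2 + 2hxy + by^2)\}\,dx\,dy = \frac{1}{\sqrt{ab - h^2}}\right]$$

Hence

$$f(z) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\exp(-itz)}{1 + t^2}\,dt = \frac{1}{\pi} \int_0^{\infty} \frac{\cos tz}{1 + t^2}\,dt = \tfrac{1}{2}\,e^{-|z|}.$$

$$\left[\text{Recall } \int_0^{\infty} \frac{\cos(mx)}{1 + x^2}\,dx = \frac{\pi}{2}\,e^{-|m|}\right]$$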


Exercises 


1 . Are e - f2 | sin/| ,/'° S ' , e“ and 1 /( 1 + / 2 ) chara¬ 

cteristic functions ? if not, why not ? It so. find the corresponding 
density functions. IM A. (Bomb), 56, M.A. (Delhi) 55] 

2. Find the frequency density of the distribution for which $\kappa_r = (r - 1)!$. [M.A. (Delhi) 61]

[Hint. $K(t) = \sum_{r=1}^{\infty} \dfrac{(it)^r}{r!}\,(r - 1)! = -\log(1 - it) = \log(1 - it)^{-1}$.

Also $K(t) = \log \phi(t)$. Hence $\phi(t) = (1 - it)^{-1}$.

This is the characteristic function of the p.d.f.

$f(x) = e^{-x}$, $0 \le x < \infty$]

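3. Let $f(x, y) = \dfrac{1}{4a^2}\,[1 + xy(x^2 - y^2)]$, $|x| \le a$, $|y| \le a$, $a > 0$

$= 0$ elsewhere.

Then

$$\phi_X(t) = \frac{\sin at}{at}, \qquad \phi_Y(t) = \frac{\sin at}{at}, \qquad \phi_{X+Y}(t) = \left[\frac{\sin at}{at}\right]^2$$

and hence $\phi_{X+Y}(t) = \phi_X(t)\,\phi_Y(t)$ though the variables are not independent. [M.A. (Luck.) 63]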





4. Find the distribution function corresponding to the characteristic function

<MO=exp {VO-iO-l}. -oo<x<oo 
[Hint. 1 =r cos d, t — r sin 6] 

[M.A. (Bomb) 55, I.C.A.R. 50] 

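5. The distribution

$$dF(x) = \frac{1}{\pi}\left(\frac{1 - \cos x}{x^2}\right) dx, \quad -\infty < x < \infty$$

has the characteristic function

$\phi(t) = 1 - |t|$, $|t| \le 1$

$= 0$, $|t| > 1$.

[Hint. Recall

$$\int_0^{\infty} \frac{\sin \alpha x}{x}\,dx = \frac{\pi}{2}, \quad \alpha > 0,$$

$$\int_0^{\infty} \frac{\cos \beta x - \cos \alpha x}{x^2}\,dx = \frac{\pi}{2}(\alpha - \beta), \quad \alpha > \beta > 0.$$

By using identities such as

$$2 \sin\theta \cos m\theta = \sin(m + 1)\theta - \sin(m - 1)\theta$$

and integrating from 0 to $\infty$ we get the result.]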

6. A random variate X is distributed in the Poisson form with parameter $\lambda$. If $\lambda$ is a random variate with a characteristic function $\phi(t)$, show that the characteristic function of X is

$$\phi\left(\frac{e^{it} - 1}{i}\right). \quad \text{[M.A. Delhi 65]}$$

7. A variate X has moments $\mu_r' = \dfrac{r!}{(r/2)!}$ when r is even and $\mu_r' = 0$ when r is odd. Deduce the distribution of the variate from its characteristic function. [M.A. (Delhi) 59]

8. Obtain the characteristic function of the standard binomial variate $Z = \dfrac{X - np}{\sqrt{npq}}$ with usual notations. Hence or otherwise show that Z tends to normality as $n \to \infty$.

10. State the inversion theorem, and apply it to find the probability density functions, if any, for the characteristic functions

(i) $\phi(t) = 1 - |t|$ for $|t| \le 1$

$= 0$ for $|t| > 1$

(ii) $\phi(t) = 1/(1 - it)$.

11. Define the characteristic function of a random variable and state the conditions necessary for a function to be a characteristic function.

Obtain the characteristic function of a binomial variate.


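8.1. Derivation of the normal distribution

The normal distribution is the limiting form of the Binomial distribution when n, the number of trials, is very large and neither p nor q is very small.

In the Binomial distribution with parameters n, p the probability

$$P(X = x) = {}^{n}C_x\,p^x q^{n - x}, \quad x = 0, 1, \ldots, n$$

and $E(X) = np$, $\text{Var}(X) = npq$.

Here $\sqrt{npq}$ means $\sqrt{(npq)}$ and $\sqrt{2\pi}$ means $\sqrt{(2\pi)}$.

Now we introduce the new variate

$$z = \frac{x - np}{\sqrt{npq}}, \qquad \frac{-np}{\sqrt{npq}} \le z \le \frac{nq}{\sqrt{npq}}.$$

Now $E(Z) = 0$, $\text{Var}(Z) = 1$ and as x ranges from 0 to n, the values of z increase by $\dfrac{1}{\sqrt{npq}}$, which decreases to zero as n tends to infinity. Also we note that in the case of large n, $-\infty \le z \le \infty$ when neither p nor q is zero. Thus in the limit, the distribution of z will tend to a continuous distribution with mean zero and variance unity.

Now

$$dp = \lim_{n \to \infty} P(X = x) = \lim_{n \to \infty} \frac{n!}{x!\,(n - x)!}\,p^x q^{n - x}$$

$$= \lim_{n \to \infty} \frac{\sqrt{2\pi n}\;n^n e^{-n}\,p^x q^{n - x}}{\sqrt{2\pi x}\;x^x e^{-x}\,\sqrt{2\pi(n - x)}\;(n - x)^{n - x} e^{-n + x}} \quad \text{(by Stirling's formula)}$$

$$= \lim_{n \to \infty} \frac{n^{n + 1/2}\,p^x q^{n - x}}{\sqrt{2\pi}\;x^{x + 1/2}\,(n - x)^{n - x + 1/2}}$$

$$= \lim_{n \to \infty} \frac{1}{\sqrt{2\pi npq}} \left(\frac{np}{x}\right)^{x + 1/2} \left(\frac{nq}{n - x}\right)^{n - x + 1/2}$$

$$= \lim_{n \to \infty} \frac{1}{\sqrt{2\pi npq}} \cdot \frac{1}{N},$$

where

$$N = \left(\frac{x}{np}\right)^{x + 1/2} \left(\frac{n - x}{nq}\right)^{n - x + 1/2}.$$

Now $\dfrac{x}{np} = 1 + z\sqrt{\dfrac{q}{np}}$ and $\dfrac{n - x}{nq} = 1 - z\sqrt{\dfrac{p}{nq}}$.

Therefore

$$\log_e N = (np + z\sqrt{npq} + \tfrac{1}{2}) \log_e\left(1 + z\sqrt{q/np}\right) + (nq - z\sqrt{npq} + \tfrac{1}{2}) \log_e\left(1 - z\sqrt{p/nq}\right).$$

We assume $\left|z\sqrt{q/np}\right| < 1$ and $\left|z\sqrt{p/nq}\right| < 1$, so that both logarithms may be expanded in series.

Hence we have

$$\log_e N = (np + z\sqrt{npq} + \tfrac{1}{2})\left[z\sqrt{\frac{q}{np}} - \frac{z^2 q}{2np} + O(n^{-3/2})\right] + (nq - z\sqrt{npq} + \tfrac{1}{2})\left[-z\sqrt{\frac{p}{nq}} - \frac{z^2 p}{2nq} + O(n^{-3/2})\right]$$

$$= \frac{z^2}{2}(p + q) + O(n^{-1/2}) = \frac{z^2}{2} + O(n^{-1/2}).$$

Hence $\log N \to \dfrac{z^2}{2}$, i.e. $N \to e^{z^2/2}$, as $n \to \infty$.

When x ranges over 0 to n, the increment $\dfrac{1}{\sqrt{npq}}$ in the values of z can be denoted as $dz = \lim\limits_{n \to \infty} \dfrac{1}{\sqrt{npq}}$.

Therefore

$$dp = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}\,dz, \quad -\infty < z < \infty.$$

This is the required continuous distribution of z and is called the normal distribution.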
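The limiting argument can be illustrated numerically by comparing a binomial probability with the normal ordinate scaled by the increment $1/\sqrt{npq}$. This is a hedged sketch: the values n = 1000 and p = 0.3 and the function names are assumptions chosen for illustration.

```python
import math

def binomial_pmf(n, p, x):
    """Exact binomial probability P(X = x)."""
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)

def normal_approx(n, p, x):
    """Normal ordinate at z = (x - np)/sqrt(npq), multiplied by dz = 1/sqrt(npq)."""
    s = math.sqrt(n * p * (1.0 - p))
    z = (x - n * p) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))

n, p = 1000, 0.3
for x in (280, 300, 320):
    print(x, binomial_pmf(n, p, x), normal_approx(n, p, x))
```

The agreement is closest near the mean np and loosens in the tails, where the neglected terms of order $n^{-1/2}$ matter more.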

Remark 1. If $X_n$ denotes the random variable with a Poisson distribution in which $m = np$, and if $Z_n = (X_n - m)/\sqrt{m}$, then

$$\lim_{n \to \infty} F_{Z_n}(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-u^2/2}\,du.$$

[Recall $E(X_n) = m$, $\text{Var}(X_n) = m$]

Proof.

$$\phi_{Z_n}(t) = E(e^{itZ_n}) = E\left[e^{-it\sqrt{m}}\,e^{itX_n/\sqrt{m}}\right] = e^{-it\sqrt{m}}\,\phi_{X_n}(t/\sqrt{m}) = e^{-it\sqrt{m}}\,e^{m(e^{it/\sqrt{m}} - 1)}.$$

$$\left[\text{Recall } \phi_{X_n}(t) = E(e^{itX_n}) = \sum_{x_n = 0}^{\infty} e^{itx_n}\,\frac{e^{-m} m^{x_n}}{x_n!} = e^{-m} \sum_{x_n = 0}^{\infty} \frac{(me^{it})^{x_n}}{x_n!} = e^{m(e^{it} - 1)}\right]$$

and

$$m(e^{it/\sqrt{m}} - 1) = m \sum_{k=1}^{\infty} \frac{(it/\sqrt{m})^k}{k!} = it\sqrt{m} - \frac{t^2}{2} + \sum_{k=3}^{\infty} \frac{(it)^k}{k!\;m^{(k - 2)/2}}.$$

Thus

$$\phi_{Z_n}(t) = \exp\left[-\frac{t^2}{2} + \sum_{k=3}^{\infty} \frac{(it)^k}{k!\;m^{(k - 2)/2}}\right]$$

and

$$\lim_{n \to \infty} \phi_{Z_n}(t) = e^{-t^2/2}.$$

This limit is the characteristic function corresponding to the normalized normal variable. Hence we conclude that the d.f. of $Z_n$ converges to the d.f. of the normalized normal variable. Since $m = np$ and $n \to \infty \Rightarrow m \to \infty$, we can say that a Poisson distribution with a large mean approximates to a normal distribution.
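The convergence of the characteristic function in Remark 1 can be observed directly. The following minimal sketch evaluates the closed form $e^{-it\sqrt{m}}\,e^{m(e^{it/\sqrt{m}} - 1)}$ for increasing m; the function name is hypothetical.

```python
import cmath
import math

def standardized_poisson_cf(t, m):
    """Characteristic function of Z = (X - m)/sqrt(m), where X is a Poisson variate with mean m."""
    s = math.sqrt(m)
    return cmath.exp(-1j * t * s + m * (cmath.exp(1j * t / s) - 1.0))

t = 1.0
limit = math.exp(-0.5 * t * t)
for m in (10.0, 1000.0, 1.0e6):
    # The distance from the normal characteristic function shrinks as m grows.
    print(m, abs(standardized_poisson_cf(t, m) - limit))
```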

Remark 2. Consider

$$dP = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}\,dz, \quad -\infty < z < \infty.$$

Putting $z = \dfrac{x - m}{\sigma}$, $dz = \dfrac{1}{\sigma}\,dx$, we obtain

$$dP = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}\,dx, \quad -\infty < x < \infty.$$

Definition. The random variable X, assuming all real values $-\infty < x < \infty$, has a normal (or Gaussian) distribution if its pdf is of the form

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}, \quad -\infty < x < \infty.$$

The parameters m and $\sigma$ must satisfy the conditions $-\infty < m < \infty$, $\sigma > 0$. We shall use the notation: X has distribution $N(m, \sigma^2)$ if and only if its probability distribution is given by the above equation.

The normal distribution is very important in Statistics. We observe that a large number of random variables occurring in many applications have a distribution closely resembling the normal distribution.


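8.2. Properties of the Normal Distribution

(i) Obviously $f(x) \ge 0$. We must check that

$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$

Putting $t = (x - m)/\sigma$,

$$\int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp(-\tfrac{1}{2}t^2)\,dt = I, \text{ say.}$$

Then

$$I^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp(-t^2/2)\,dt \int_{-\infty}^{\infty} \exp(-s^2/2)\,ds = \frac{1}{2\pi} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \exp\{-(s^2 + t^2)/2\}\,ds\,dt.$$

The substitution $s = r\cos\theta$, $t = r\sin\theta$ provides the element of area

$$ds\,dt = \left|\frac{\partial(s, t)}{\partial(r, \theta)}\right| dr\,d\theta = r\,dr\,d\theta.$$

As s and t vary between $-\infty$ and $+\infty$, r varies between 0 and $\infty$, while $\theta$ varies between 0 and $2\pi$. Thus

$$I^2 = \frac{1}{2\pi} \int_0^{2\pi}\!\int_0^{\infty} r \exp(-r^2/2)\,dr\,d\theta = \frac{1}{2\pi} \int_0^{2\pi} \left[-e^{-r^2/2}\right]_0^{\infty} d\theta = \frac{1}{2\pi} \int_0^{2\pi} d\theta = 1.$$

Hence $I = 1$.

(ii) The graph of $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}$ indicates that it has the well known bell shape. The graph of f is symmetric with respect to m. At $x = m$ the graph of f is concave downward. As $x \to \pm\infty$, $f(x) \to 0$ asymptotically. Since $f(x) \ge 0$ for all x, this means that for large (positive or negative) values of x, the graph of f is concave upward. The point at which the concavity changes is called a point of inflexion, and it is located by solving the equation $d^2f/dx^2 = 0$. Now $d^2f/dx^2 = 0 \Rightarrow x = m \pm \sigma$. Thus if $\sigma$ is relatively large, the graph of f tends to be "flat", while if $\sigma$ is small the graph of f tends to be quite "peaked".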

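(iii) The following important probabilistic meanings are associated with the parameters m and $\sigma$.

Evidently

$$E(X) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x\,e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}\,dx = m \quad \text{(vide Ex. 2, p. 179)}$$

Now

$$E(X^2) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2\,e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}\,dx.$$

Letting $z = (x - m)/\sigma$, we obtain

$$E(X^2) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (\sigma z + m)^2\,e^{-z^2/2}\,dz = \sigma^2\,\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^2 e^{-z^2/2}\,dz + m^2$$

(the cross term $2m\sigma z\,e^{-z^2/2}$ integrates to zero, being odd)

$$= \sigma^2\left[\frac{1}{\sqrt{2\pi}}\left[-z e^{-z^2/2}\right]_{-\infty}^{\infty} + \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-z^2/2}\,dz\right] + m^2$$

$$\left[\text{Note. } z^2 e^{-z^2/2} = (z)(z e^{-z^2/2}) \text{ and } \int z e^{-z^2/2}\,dz = -e^{-z^2/2}\right]$$

$$= \sigma^2 + m^2.$$

Hence $V(X) = E(X^2) - E^2(X) = \sigma^2$.

Thus we find that the two parameters m and $\sigma^2$ which characterize the normal distribution are the expectation and variance of X respectively.

The graph of the pdf of a normally distributed random variable is symmetric about $x = m$. The steepness of the graph is determined by $\sigma^2$ in the sense that if X has distribution $N(m, \sigma_1^2)$ and Y has distribution $N(m, \sigma_2^2)$ where $\sigma_1^2 > \sigma_2^2$, then their pdf's would have the relative shapes shown in the figure.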

(iv) If X has distribution N(0, 1), we say that X has a standardized normal distribution. That is, the pdf of X may be written as

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}.$$

The importance of the standardized normal distribution is the fact that it is tabulated. Whenever X has distribution $N(m, \sigma^2)$ we can always obtain the distribution of $(X - m)/\sigma$ as N(0, 1).

Remark. Let X have distribution $N(m, \sigma^2)$ and let $Y = aX + b$; then Y has distribution $N(am + b,\ a^2\sigma^2)$.

We have for the pdf of Y

$$g(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2}\left(\frac{y - b}{a} - m\right)^2\right\} \left|\frac{1}{a}\right| = \frac{1}{\sqrt{2\pi}\,\sigma\,|a|} \exp\left\{-\frac{1}{2\sigma^2 a^2}\,[y - (am + b)]^2\right\}$$

which represents the pdf of a random variable with distribution $N(am + b,\ a^2\sigma^2)$.

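(v) The mode of the normal distribution

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}$$

exists at $x = m$. This is obtained from the conditions that $f'(x) = 0$ and $f''(x)$ is negative.

(vi) Mean deviation from the mean of the normal distribution is given by

$$\int_{-\infty}^{\infty} |x - m|\,\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}\,dx = \frac{\sigma}{\sqrt{2\pi}} \int_{-\infty}^{\infty} |z|\,e^{-z^2/2}\,dz \quad \text{(putting } z = (x - m)/\sigma)$$

$$= \frac{2\sigma}{\sqrt{2\pi}} \int_0^{\infty} z\,e^{-z^2/2}\,dz = \frac{2\sigma}{\sqrt{2\pi}}\left[-e^{-z^2/2}\right]_0^{\infty} = \sigma\sqrt{\frac{2}{\pi}} \approx \tfrac{4}{5}\sigma.$$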

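(vii) The moments of odd order about the mean are all zero. For

$$\mu_{2n+1} = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} (x - m)^{2n+1}\,e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}\,dx = 0,$$

since the integrand is an odd function of $x - m$.

The moments of even order about the mean are given by

$$\mu_{2n} = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} (x - m)^{2n}\,e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}\,dx = \frac{\sigma^{2n}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^{2n} e^{-z^2/2}\,dz \quad \left(z = \frac{x - m}{\sigma}\right)$$

$$= \frac{2\sigma^{2n}}{\sqrt{2\pi}} \int_0^{\infty} z^{2n} e^{-z^2/2}\,dz = \frac{\sigma^{2n}\,2^n}{\sqrt{\pi}} \int_0^{\infty} u^{n - 1/2} e^{-u}\,du \quad (\tfrac{1}{2}z^2 = u)$$

$$= \frac{\sigma^{2n}\,2^n}{\sqrt{\pi}}\,\Gamma(n + \tfrac{1}{2}) \quad \left[\text{recall } \Gamma(n) = \int_0^{\infty} e^{-x} x^{n - 1}\,dx,\ n > 0\right]$$

thus

$$\mu_{2n} = \frac{\sigma^{2n}\,2^n}{\sqrt{\pi}} \cdot \frac{2n - 1}{2} \cdot \frac{2n - 3}{2} \cdots \frac{3}{2} \cdot \frac{1}{2}\,\sqrt{\pi} \quad (\text{recall } \Gamma(n + 1) = n\Gamma(n) \text{ and } \Gamma(\tfrac{1}{2}) = \sqrt{\pi})$$

$$= (2n - 1)(2n - 3) \cdots 3 \cdot 1 \cdot \sigma^{2n}.$$

Clearly $\mu_{2n} = (2n - 1)\,\sigma^2\,\mu_{2n - 2}$.

In particular

$\mu_2 = \sigma^2$, $\mu_4 = 3\sigma^4$, $\mu_6 = 15\sigma^6, \ldots$

and $\mu_1 = \mu_3 = \mu_5 = \cdots = 0$.

Hence $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3} = 0$, $\beta_2 = \dfrac{\mu_4}{\mu_2^2} = 3$,

$\gamma_1 = \sqrt{\beta_1} = 0$, $\gamma_2 = \beta_2 - 3 = 0$.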
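The two forms of the even moments, the product $(2n - 1)(2n - 3) \cdots 3 \cdot 1 \cdot \sigma^{2n}$ and the Gamma-function expression, can be compared in a short sketch. The function names below are illustrative assumptions.

```python
import math

def even_central_moment(n, sigma):
    """mu_{2n} = (2n - 1)(2n - 3)...3.1 * sigma^{2n}."""
    value = 1.0
    for k in range(1, 2 * n, 2):  # odd factors 1, 3, ..., 2n - 1
        value *= k
    return value * sigma ** (2 * n)

def even_central_moment_gamma(n, sigma):
    """The same moment written as sigma^{2n} 2^n Gamma(n + 1/2) / sqrt(pi)."""
    return sigma ** (2 * n) * 2 ** n * math.gamma(n + 0.5) / math.sqrt(math.pi)

for n in (1, 2, 3, 4):
    print(n, even_central_moment(n, 1.0), even_central_moment_gamma(n, 1.0))

# beta_2 = mu_4 / mu_2^2 does not depend on sigma.
beta_2 = even_central_moment(2, 1.7) / even_central_moment(1, 1.7) ** 2
```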

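(viii) (Vide Ex. 5, p. 214) The moment generating function of the normal distribution $N(m, \sigma^2)$ with regard to the origin is

$$M_X(t) = \exp(mt + \tfrac{1}{2}\sigma^2 t^2)$$

and that about the mean m is

$$M_{X - m}(t) = \exp(\tfrac{1}{2}\sigma^2 t^2) = 1 + \tfrac{1}{2}\sigma^2 t^2 + \frac{(\tfrac{1}{2}\sigma^2 t^2)^2}{2!} + \cdots + \frac{(\tfrac{1}{2}\sigma^2 t^2)^n}{n!} + \cdots$$

giving $\mu_{2n+1} = 0$ and $\mu_{2n} = 1 \cdot 3 \cdot 5 \cdots (2n - 1)\,\sigma^{2n}$.

Hence

$$\frac{d\mu_{2n}}{d\sigma} = 2n(2n - 1)(2n - 3) \cdots 3 \cdot 1 \cdot \sigma^{2n - 1}$$

and

$$\sigma^3\,\frac{d\mu_{2n}}{d\sigma} = 2n(2n - 1)(2n - 3) \cdots 3 \cdot 1 \cdot \sigma^{2n + 2}.$$

Thus

$$\sigma^2 \mu_{2n} + \sigma^3\,\frac{d\mu_{2n}}{d\sigma} = (2n + 1)(2n - 1)(2n - 3) \cdots 3 \cdot 1 \cdot \sigma^{2n + 2} = \mu_{2n + 2}.$$

Hence for the normal distribution there exists the recurrence relation:

$$\mu_{2n + 2} = \sigma^2 \mu_{2n} + \sigma^3\,\frac{d\mu_{2n}}{d\sigma}.$$

Putting $n = 0, 1$ provides: $\mu_2 = \sigma^2$, $\mu_4 = \sigma^4 + 2\sigma^4 = 3\sigma^4$.

(ix) The cumulant generating function of the normal distribution is

$$K_X(t) = \log M_X(t) = mt + \tfrac{1}{2}t^2\sigma^2.$$

Thus

$$\kappa_1 t + \kappa_2\,\frac{t^2}{2!} + \cdots + \kappa_r\,\frac{t^r}{r!} + \cdots = mt + \tfrac{1}{2}\sigma^2 t^2,$$

so that $\kappa_1 = m$, $\kappa_2 = \sigma^2$, $\kappa_3 = \kappa_4 = \cdots = 0$.

For the normal distribution all the cumulants after the second are equal to zero.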

8.3. Distribution of a sum of independent normal variates.

Theorem 1. If the independent variates $X_i$ ($i = 1, 2, \ldots, n$) are normally distributed with means $m_i$ and variances $\sigma_i^2$, the variate $\sum c_i X_i$ is normally distributed with mean $\sum c_i m_i$ and variance $\sum c_i^2 \sigma_i^2$.

Proof. Assume X is a normal variate with mean m and variance $\sigma^2$. Then $M_X(t) = \exp(mt + \tfrac{1}{2}\sigma^2 t^2)$ and hence

$$M_{cX}(t) = E\{\exp(cXt)\} = M_X(ct) = \exp(cmt + \tfrac{1}{2}c^2\sigma^2 t^2).$$

Consequently

$$M_{\sum c_i X_i}(t) = E\{\exp(\textstyle\sum_i c_i X_i t)\} = E\{\textstyle\prod_i \exp(c_i X_i t)\} = \prod_i E\{\exp(c_i X_i t)\} = \prod_i M_{c_i X_i}(t) = \prod_i M_{X_i}(c_i t)$$

$$= \prod_i \exp(c_i m_i t + \tfrac{1}{2}c_i^2\sigma_i^2 t^2) = \exp\{t \textstyle\sum c_i m_i + \tfrac{1}{2}t^2 \sum c_i^2 \sigma_i^2\}.$$

This is nothing but the m.g.f. of a normal distribution with mean $\sum c_i m_i$ and variance $\sum c_i^2 \sigma_i^2$. This proves the theorem.

Corollary 1. Let $m_i = m$, $\sigma_i = \sigma$, $c_i = 1/n$ for each i.

Then $\sum_{i=1}^{n} c_i X_i = \bar{x}$ and $\sum_{i=1}^{n} c_i^2 \sigma_i^2 = \dfrac{n\sigma^2}{n^2} = \dfrac{\sigma^2}{n}$.

Hence we have the following important result:

If the independent variates $X_i$ ($i = 1, 2, \ldots, n$) are normally distributed about a common mean, m, with a common variance, $\sigma^2$, their mean $\bar{x}$ is also normally distributed about m, but with variance $\sigma^2/n$. In symbols:

$$X_i \sim N(m, \sigma^2) \Rightarrow \bar{x} \sim N(m, \sigma^2/n) \Rightarrow \frac{\bar{x} - m}{\sigma/\sqrt{n}} \sim N(0, 1),$$

where $i = 1, 2, \ldots, n$; that is, $\dfrac{\bar{x} - m}{\sigma/\sqrt{n}}$ is a standard normal variate.

Corollary 2. If $c_i = 1$, then $\sum c_i X_i = \sum X_i$ is distributed normally with mean $\sum m_i$ and variance $\sum \sigma_i^2$. This is called the additive property of normal variates.

Corollary 3. Letting $c_1 = 1$, $c_2 = -1$, $c_3 = c_4 = \cdots = c_n = 0$, we get that the variate $X_1 - X_2$ is normally distributed with mean $m_1 - m_2$ and variance $\sigma_1^2 + \sigma_2^2$.
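The parameters given by Theorem 1 and its corollaries can be computed with a small helper. This is a minimal sketch; the function name `linear_combination` and the numerical values are assumptions chosen for illustration.

```python
import math

def linear_combination(cs, ms, sigmas):
    """Mean and variance of sum(c_i * X_i) for independent X_i ~ N(m_i, sigma_i^2)."""
    mean = sum(c * m for c, m in zip(cs, ms))
    var = sum(c * c * s * s for c, s in zip(cs, sigmas))
    return mean, var

# Corollary 1: the mean of n = 25 variates, each N(10, 2^2), has variance sigma^2 / n.
n = 25
mean_bar, var_bar = linear_combination([1.0 / n] * n, [10.0] * n, [2.0] * n)

# Corollary 3: X1 - X2 with X1 ~ N(5, 3^2) and X2 ~ N(3, 4^2).
mean_diff, var_diff = linear_combination([1.0, -1.0], [5.0, 3.0], [3.0, 4.0])

print(mean_bar, var_bar, mean_diff, var_diff)
```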


Exercises 


1. Find the mean and the standard deviation of the following probability distribution

$$f(x) = c\,e^{-(1/24)(x^2 - 6x + 4)}, \quad -\infty < x < \infty,$$

c being a constant.

[Ans. mean = 3, s.d. = $\sqrt{12}$]


2. If two normal universes have the same total frequency but the standard deviation of one is k times that of the other, show that the maximum frequency of the first is 1/k times that of the other.

[Hint. Let the total frequency be N. Then

$$g(x) = \frac{N}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - m_1}{\sigma}\right)^2}, \qquad h(y) = \frac{N}{k\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{y - m_2}{k\sigma}\right)^2}.$$

Clearly the maximum frequency of the variate X is $\dfrac{N}{\sigma\sqrt{2\pi}}$ and that of Y is $\dfrac{N}{k\sigma\sqrt{2\pi}}$. Hence the result follows.]

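3. If a normal distribution is grouped in intervals with total frequency N and S is the sum of squares of the frequencies, an estimate of the standard deviation $\sigma$ is given by $\dfrac{N^2}{2S\sqrt{\pi}}$.

[B.Sc. Agra 62]

[Hint. $y = f(x) = \dfrac{N}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2}$, $-\infty < x < \infty$.

$$S = \int_{-\infty}^{\infty} y^2\,dx = \frac{N^2}{2\pi\sigma^2} \int_{-\infty}^{\infty} e^{-\left(\frac{x - m}{\sigma}\right)^2}\,dx = \frac{N^2}{2\sigma\sqrt{\pi}} \cdot \frac{1}{\sigma'\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}\left(\frac{x - m}{\sigma'}\right)^2}\,dx \quad [\sigma' = \sigma/\sqrt{2}]$$

$$= \frac{N^2}{2\sigma\sqrt{\pi}},$$

whence $\sigma = \dfrac{N^2}{2S\sqrt{\pi}}$.]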

4. For the normal distribution, the quartile deviation, the mean deviation and the standard deviation are approximately 10 : 12 : 15. [B.Sc. (Delhi) 67]

[Hint. Taking the mean as origin,

$$\frac{1}{\sigma\sqrt{2\pi}} \int_0^{Q_3} e^{-x^2/2\sigma^2}\,dx = \frac{1}{4}, \quad \text{or} \quad \frac{1}{\sqrt{2\pi}} \int_0^{Q_3/\sigma} e^{-\xi^2/2}\,d\xi = \frac{1}{4} \quad (x/\sigma = \xi),$$

hence $Q_3/\sigma = 0.6745$ from the table ($Q_3$ being the upper quartile). Also $Q_1/\sigma = -0.6745$, where $Q_1$ is the lower quartile.

Therefore quartile deviation $= 0.6745\,\sigma = \tfrac{2}{3}\sigma$ approx.]

5. For a normal distribution with mean m and variance $\sigma^2$, the central moments satisfy the relation

$$\mu_{2n} = (2n - 1)\,\sigma^2\,\mu_{2n - 2}$$

and then $\mu_{2n + 1} = 0$ for $n = 1, 2, \ldots$

6. Let X and Y be two independent random variables each with a distribution N(0, 1). Find the probability density function of $Z = a_1 X + a_2 Y$, where $a_1$ and $a_2$ are constants.

7. Determine the constant c so that 

— dr 

f (x) = c e ^ ,-cc<x<oo, 

satisfies the conditions of being a pdf 

8. Let X and Y be stochastically independent random variables, each with a distribution N(0, 1). Let $Z = X + Y$. Find the integral that represents the distribution function $G(z) = P(X + Y \le z)$ of Z. Determine the pdf of Z.

[Hint. $G(z) = P(Z \le z) = P(X + Y \le z) = \iint_R g(x)h(y)\,dx\,dy$, where $R = \{(x, y) \mid x + y \le z\}$.

$$G(z) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{z - x} g(x)h(y)\,dy\,dx = \int_{-\infty}^{\infty} g(x)\left[\int_{-\infty}^{z - x} h(y)\,dy\right] dx,$$

$$G'(z) = \int_{-\infty}^{\infty} g(x)\,h(z - x)\,dx, \quad \text{where } g(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2},\ h(y) = \frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}.]$$

9. For a certain $N(m, \sigma^2)$, the first moment about 10 is 40 and the fourth moment about 50 is 48. Then the arithmetic mean and variance of $N(m, \sigma^2)$ are 50 and 4 respectively.

8.4. Tables for Ordinates. The normal curve is given by

$$y = \frac{1}{\sigma\sqrt{2\pi}} \exp(-x^2/2\sigma^2).$$

Tables have been prepared to give the values of

$$\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2},$$

the origin of x being assumed at the mean. Dividing these values by $\sigma$, we get the values of the ordinates y for the above normal curve.

Fitting of a Normal Curve. The equation of the normal curve for total frequency N, or area N under the curve, is

$$y = \frac{N}{\sigma\sqrt{2\pi}}\,e^{-(x - m)^2/2\sigma^2}.$$

Suppose it is desired to fit the normal curve to the given symmetrical data. We make the following assumptions:

(i) the area under the curve is equal to N, the total frequency of the given data;

(ii) the mean and the standard deviation calculated from the data are taken equal to m and $\sigma$ of the normal distribution.

The theoretical frequencies may be calculated by using the tables of areas or ordinates of the standard normal distribution. When $N \ne 1$ and $\sigma \ne 1$ the tabular values of y must be multiplied by $N/\sigma$.

For example, the ordinate of the normal curve given by

$$y = \frac{10{,}000}{4\sqrt{2\pi}}\,e^{-x^2/32}$$

corresponding to $x = 7$ is obtained by finding the tabular value of $y = \dfrac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$ for $x = 7/4$ and then multiplying it by $\dfrac{10{,}000}{4}$.

Ex. A set of 10 coins is thrown 1024 times. Find the frequency of two successes over the mean.

The binomial distribution is $N(q + p)^n$, say $1024(\tfrac{1}{2} + \tfrac{1}{2})^{10}$.

N = area = total frequency = 1024

m = mean = np = $\tfrac{1}{2} \cdot 10 = 5$

$\sigma^2$ = variance = npq = $10 \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = 2.5$

i.e. $\sigma = 1.581$.

The equation of the normal curve with these constants is

$$y = \frac{1024}{1.581\sqrt{2\pi}}\,e^{-(x - 5)^2/5}.$$

For the normal distribution the frequency of two successes over the mean is

$$\frac{1024}{1.581\sqrt{2\pi}}\,e^{-(7 - 5)^2/5} = 116.1.$$

Enter the table at $\dfrac{x}{\sigma} = \dfrac{2}{1.581} = 1.2650$.

Rough interpolation gives tabular y = 0.1792, and

$$\text{frequency} = \frac{0.1792 \times 1024}{1.581} = 116.1.$$
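The fitted frequency in the coin example can be recomputed without tables. The sketch below is illustrative: the function names are assumptions, and the exact binomial frequency $1024 \cdot {}^{10}C_7 / 2^{10} = 120$ is shown only for comparison.

```python
import math

def normal_ordinate(z):
    """Tabular ordinate y = e^{-z^2/2} / sqrt(2*pi) of the standard normal curve."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def fitted_frequency(x, total, n, p):
    """Ordinate at x of the normal curve fitted to total * (q + p)^n."""
    m = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    return total / sigma * normal_ordinate((x - m) / sigma)

freq = fitted_frequency(7, 1024, 10, 0.5)
exact = 1024 * math.comb(10, 7) / 2 ** 10  # binomial frequency of 7 heads
print(round(freq, 1), exact)
```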


Tabulation of the Normal Distribution. Suppose that the random variable X has distribution N(0, 1). Then

$$P(a \le X \le b) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-x^2/2}\,dx.$$

This integral cannot be evaluated by ordinary means. In fact $P(X \le s)$ has been tabulated.

The cdf of the standardized normal distribution will be consistently denoted by $\Phi$. That is,

$$\Phi(s) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{s} e^{-x^2/2}\,dx.$$

The function $\Phi$ has been extensively tabulated, and an excerpt of such a table is given. We may now use the tabulation of the function $\Phi$ in order to evaluate $P(a \le X \le b)$ where X has the distribution N(0, 1), since

$$P(a \le X \le b) = \Phi(b) - \Phi(a).$$

The particular usefulness of the above tabulation is due to the fact that if X has any normal distribution $N(m, \sigma^2)$, the tabulated function $\Phi$ may be used to evaluate probabilities associated with X.

We simply note that if X has distribution $N(m, \sigma^2)$, then $Y = \dfrac{X - m}{\sigma}$ has distribution N(0, 1). Hence

$$P(a \le X \le b) = P\left(\frac{a - m}{\sigma} \le Y \le \frac{b - m}{\sigma}\right) = \Phi\left(\frac{b - m}{\sigma}\right) - \Phi\left(\frac{a - m}{\sigma}\right) \qquad \ldots(1)$$

It is also evident from the definition of $\Phi$ that

$$\Phi(-x) = 1 - \Phi(x) \qquad \ldots(2)$$

This relationship is particularly useful since in most tables the function $\Phi$ is tabulated only for positive values of x.

Finally we compute $P(m - k\sigma \le X \le m + k\sigma)$, where X has distribution $N(m, \sigma^2)$. The above probability may be expressed in terms of the function $\Phi$ by writing

$$P(m - k\sigma \le X \le m + k\sigma) = P\left(-k \le \frac{X - m}{\sigma} \le k\right) = \Phi(k) - \Phi(-k).$$

Using equation (2), we have for $k > 0$

$$P(m - k\sigma \le X \le m + k\sigma) = 2\Phi(k) - 1 \qquad \ldots(3)$$

Note that this probability is independent of m and $\sigma$. The probability that a random variable with distribution $N(m, \sigma^2)$ takes values within k times the s.d. of the expected value depends only on k and is given by (3).

In particular, from the tables we have

$$P(m - \sigma \le X \le m + \sigma) = P\left(-1 \le \frac{X - m}{\sigma} \le 1\right) = 0.6826$$

and

$$P(|X - m| > \sigma) = 1 - P(|X - m| \le \sigma) = 1 - 0.6826 = 0.3174,$$

which is less than $\tfrac{1}{3}$.

In words: the probability that a random value of a normal variate will deviate more than $\sigma$ from the mean is less than $\tfrac{1}{3}$.

Similarly $P(|x - m| < 2\sigma) = 0.9544$

and $P(|x - m| > 2\sigma) = 0.0456$.

Thus the probability that a random value of x will deviate more than $2\sigma$ from the mean is about $4\tfrac{1}{2}\%$.

Similarly $P(|x - m| > 2.5\sigma) = 0.0124$, which is about $1\tfrac{1}{4}\%$ of the whole; and $P(|x - m| > 3\sigma) = 0.0027$, which is about $\tfrac{1}{4}\%$ of the whole. It may be verified and remembered that

$P(|x - m| > k\sigma) = 5\% \Rightarrow k = 1.96$

and $P(|x - m| > k\sigma) = 1\% \Rightarrow k = 2.58$.


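Ex. Let $X$ have the distribution $N(3, 4)$. Find $c$ such that

$$P(X > c) = 2P(X \le c).$$

We note that $(X - 3)/2$ has the distribution $N(0, 1)$. Hence

$$P(X > c) = P\left(\frac{X - 3}{2} > \frac{c - 3}{2}\right) = 1 - \Phi\left(\frac{c - 3}{2}\right).$$

Also,

$$P(X \le c) = P\left(\frac{X - 3}{2} \le \frac{c - 3}{2}\right) = \Phi\left(\frac{c - 3}{2}\right).$$

Hence the given condition yields $1 - \Phi[(c-3)/2] = 2\Phi[(c-3)/2]$,
or $\Phi[(c-3)/2] = \tfrac13$.

Hence (from the tables of the normal distribution) we find that $(c - 3)/2 = -0.43$, yielding $c = 2.14$.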
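The table look-up in this example amounts to inverting $\Phi$ at $\tfrac13$. A minimal sketch, assuming Python 3.8 or later (where `statistics.NormalDist` is available), is:

```python
from statistics import NormalDist

X = NormalDist(mu=3, sigma=2)      # N(3, 4): variance 4 means s.d. 2

# P(X > c) = 2 P(X <= c) is equivalent to P(X <= c) = 1/3.
c = X.inv_cdf(1 / 3)

print("c =", round(c, 2))
print("P(X > c)  =", round(1 - X.cdf(c), 4))
print("2P(X <= c) =", round(2 * X.cdf(c), 4))
```

The last two printed values coincide, which confirms that the computed $c$ satisfies the stated condition.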





TABLE 1. Ordinates of the Normal Curve

$$y = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$

For frequency $= N$ and s.d. $= \sigma$, multiply tabular $y$ by $N/\sigma$ for actual ordinate.
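Any entry of Table 1 can be recomputed directly from the formula above. The following Python sketch builds the table to four decimals; the layout (rows for $x/\sigma$ = 0.0, 0.1, ..., 3.9 and columns for the second decimal 0.00 to 0.09) and the names `ordinate`, `ordinate_row` and `table` are assumptions made for the illustration.

```python
import math

def ordinate(z):
    """Tabular y: ordinate of the standard normal curve at z = x/sigma."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def ordinate_row(tenths):
    """Ten entries for x/sigma = tenths/10 + 0.00, 0.01, ..., 0.09."""
    return [round(ordinate((10 * tenths + j) / 100.0), 4) for j in range(10)]

# Keys are the row labels 0.0, 0.1, ..., 3.9.
table = {tenths / 10: ordinate_row(tenths) for tenths in range(40)}

for label in (0.0, 1.0, 2.0, 3.0):
    print(label, table[label])

# Actual ordinate for total frequency N and s.d. sigma: multiply by N/sigma.
N, sigma = 1024, 1.581
print(round(ordinate(1.265) * N / sigma, 1))
```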



[Table 1 body: tabular $y$ for $x/\sigma$ from 0.0 to 3.9 (rows) and second decimal 0.00 to 0.09 (columns). The individual entries are not legible in this copy.]





































Worked Examples

Ex. 1. Fit the appropriate normal curve to the following data and calculate the theoretical frequencies.

Breadth of leaves in mm (x)      No. of leaves (f)
        12-15                           6
        16-19                          10
        20-23                          22
        24-27                          25
        28-31                          20
        32-35                          12
        36-39                           5
                                      ---
        Total                         100

Here $\Sigma f = 100$, $\bar{x} = 25.46$, $\sigma = 6.04$.

The calculation may be arranged as follows :

  (1)         (2)          (3)         (4)          (5)          (6)
Interval    x/σ, i.e.    Area to    Proportion    Col. (4)    Frequencies of the
 limit      (x − x̄)/σ     right      of area       × 100      normal distribution

  −∞          −∞         1.000
                                     .0495          4.9             5
 15.5        −1.65       .9505
                                     .1116         11.1            11
 19.5        −0.99       .8389
                                     .2134         21.3            21
 23.5        −0.32       .6255
                                     .2586         25.8            26
 27.5         0.34       .3669
                                     .2082         20.8            21
 31.5         1.00       .1587
                                     .1102         11.0            11
 35.5         1.66       .0485
                                     .0485          4.8             5
  +∞          +∞         .0000
                                                                  ---
                                                                  100






TABLE 2. Area under the Normal Curve
The area is measured from the mean, $x = 0$, to any ordinate $x$.
The result is given for values of $x/\sigma$ at intervals of 0.01.
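The entries of Table 2 are the values of $\Phi(z) - \tfrac12$, which equals $\tfrac12\,\mathrm{erf}(z/\sqrt2)$. A short Python sketch that regenerates any entry follows; the names `area_from_mean` and `area_row` are illustrative assumptions.

```python
import math

def area_from_mean(z):
    """Area under the standard normal curve between 0 and z = x/sigma."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def area_row(tenths):
    """Ten entries for x/sigma = tenths/10 + 0.00, 0.01, ..., 0.09."""
    return [round(area_from_mean((10 * tenths + j) / 100.0), 4) for j in range(10)]

for tenths in (0, 3, 10, 16, 19):
    print(tenths / 10, area_row(tenths))
```

The row for 1.6, for example, contains the entry for 1.65 that is used in the worked examples.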



[Table 2 body: areas for $x/\sigma$ from 0.0 upwards in steps of 0.1 (rows) and second decimal 0.00 to 0.09 (columns). The individual entries are not legible in this copy.]



Explanation.

Col. (1). In the given distribution the frequencies range from 12 to 39, but as in the normal distribution the range is infinite, $-\infty$ replaces 12 and $+\infty$ replaces 39, so that the whole of the normal distribution may be represented.

Col. (2). Find the distance, in terms of $\sigma$, from the mean to the lower limit of each interval, working in class interval units.

Thus, for the limit 15.5, $(15.5 - 25.46)/6.04 = -1.65$.

























Col. (3). For $x/\sigma = -1.65$, area to the right is

$$0.5 + \frac{1}{\sqrt{2\pi}} \int_0^{1.65} e^{-x^2/2}\,dx = .9505.$$

For $x/\sigma = +0.34$, area to the right is

$$1 - \left[0.5 + \frac{1}{\sqrt{2\pi}} \int_0^{0.34} e^{-x^2/2}\,dx\right] = 1 - [0.5 + .1331] = .3669.$$

Col. (4). The entries in this column are obtained by subtraction.

Thus area from $-\infty$ to 15 = 1.000 − .9505 = .0495,
area from 16 to 19 = .9505 − .8389 = .1116.

Col. (5). Finally, the proportional areas of Col. (4) are multiplied by 100, i.e. the total area, to find the actual area over each interval.

Col. (6). The required frequencies are then the entries in Col. (5) written to the nearest whole number.
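The six columns of Ex. 1 can be reproduced mechanically. The sketch below is a minimal illustration, assuming Python 3.8 or later; it takes $\bar{x} = 25.46$ and $\sigma = 6.04$ as given in the example and uses the exact cdf in place of table look-up, so the intermediate areas may differ from the tabulated ones in the last figure.

```python
from statistics import NormalDist

boundaries = [15.5, 19.5, 23.5, 27.5, 31.5, 35.5]   # Col. (1), finite limits
observed = [6, 10, 22, 25, 20, 12, 5]                # given frequencies
total = sum(observed)

fitted = NormalDist(mu=25.46, sigma=6.04)

# Cumulative area to the left of each limit, with -inf and +inf at the ends.
cdf = [0.0] + [fitted.cdf(b) for b in boundaries] + [1.0]

# Col. (4)-(5): area over each interval, scaled by the total frequency.
expected = [total * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

# Col. (6): nearest whole numbers.
rounded = [round(e) for e in expected]

for obs, exp, r in zip(observed, expected, rounded):
    print(f"observed {obs:3d}   expected {exp:6.2f}   rounded {r:3d}")
```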

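Ex. 2. In a normal distribution, 31% of the items are under 45 and 8% are over 64. Find the mean and standard deviation of the distribution. [B.A. (Hons) Delhi 60]

Let the normal distribution be $N(m, \sigma^2)$.

Then
$$P(X < 45) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(45-m)/\sigma} e^{-x^2/2}\,dx = .31$$ ...(1)

$$P(X > 64) = \frac{1}{\sqrt{2\pi}} \int_{(64-m)/\sigma}^{\infty} e^{-x^2/2}\,dx = .08$$ ...(2)

From (1), $P\left(0 < x < \dfrac{m - 45}{\sigma}\right) = .50 - .31 = .19$.

Hence, consulting the normal tables,

$$\frac{m - 45}{\sigma} = 0.4958.$$ ...(3)

From (2), $P\left(0 < x < \dfrac{64 - m}{\sigma}\right) = .50 - .08 = .42$,

and so from the normal tables

$$\frac{64 - m}{\sigma} = 1.4053.$$ ...(4)

Adding (3) and (4),

$64 - 45 = 0.4958\sigma + 1.4053\sigma$, i.e. $19 = 1.9011\sigma \Rightarrow \sigma = 10$ approx.,

and then $m - 45 = 0.4958 \times 10$, so that $m = 50$ approx.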
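The two table look-ups and the solution of the resulting pair of linear equations can be sketched as follows (Python 3.8 or later assumed; variable names are illustrative).

```python
from statistics import NormalDist

std = NormalDist()                 # standard normal N(0, 1)

z_low = std.inv_cdf(0.31)          # (45 - m)/sigma, negative
z_high = std.inv_cdf(1 - 0.08)     # (64 - m)/sigma

# Subtracting the two equations eliminates m.
sigma = (64 - 45) / (z_high - z_low)
m = 45 - z_low * sigma

print("z values:", round(z_low, 4), round(z_high, 4))
print("sigma =", round(sigma, 2), " m =", round(m, 2))
```

The same three lines solve any problem of this type in which two percentage points of a normal distribution are given.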





Ex. 3. If skulls are classified as A, B, C according as the length-breadth index is under 75, between 75 and 80, or over 80, find approximately (assuming that the distribution is normal) the mean and standard deviation of a series in which A are 58%, B are 38% and C are 4%, being given that if

$$f(t) = \frac{1}{\sqrt{2\pi}} \int_0^t e^{-x^2/2}\,dx$$

then $f(0.20) = 0.08$ and $f(1.75) = 0.46$. [M.Sc. (Agra) 60]

Suppose the normal distribution is $N(m, \sigma^2)$. Then

$$f\left(\frac{75 - m}{\sigma}\right) = \frac{1}{\sqrt{2\pi}} \int_0^{(75-m)/\sigma} e^{-x^2/2}\,dx = .58 - .50 = 0.08.$$

Hence
$$\frac{75 - m}{\sigma} = 0.20.$$ ...(1)

Also
$$f\left(\frac{80 - m}{\sigma}\right) = \frac{1}{\sqrt{2\pi}} \int_0^{(80-m)/\sigma} e^{-x^2/2}\,dx = .08 + .38 = .46.$$

Hence
$$\frac{80 - m}{\sigma} = 1.75.$$ ...(2)

Solving the equations (1) and (2) we get

$m = 74.4$ approx.
and $\sigma = 3.2$ approx.

Ex. 4. 1000 candidates in an examination were grouped into three classes I, II and III in descending order of merit. The numbers in the first two classes were 50 and 350 respectively. The highest and lowest marks in class II were 60 and 50 respectively. Assuming normality, find the average, s.d. and the number of candidates obtaining marks between 43 and 53. [B.A. (Hons) Delhi 47]

Let the normal distribution be $N(m, \sigma^2)$. Since 600 of the 1000 candidates obtained fewer than 50 marks,

$$P(X < 50) = \frac{600}{1000} = 0.6,$$

so that

$$\frac{1}{\sqrt{2\pi}} \int_0^{(50-m)/\sigma} e^{-x^2/2}\,dx = 0.6 - 0.5 = .100.$$

After interpolation we find
$$\frac{50 - m}{\sigma} = 0.254.$$ ...(1)

Also
$$P(X < 60) = \frac{950}{1000} = .95,$$

or
$$\frac{1}{\sqrt{2\pi}} \int_0^{(60-m)/\sigma} e^{-x^2/2}\,dx = .95 - .50 = .45.$$

After interpolation we find
$$\frac{60 - m}{\sigma} = 1.65.$$ ...(2)

Solving the equations (1) and (2), we get

$m = 48.2$ approx., $\sigma = 7.1$ approx.

The number of candidates obtaining marks between 43 and 53 is given by

$$\frac{1000}{\sqrt{2\pi}} \int_{(43-m)/\sigma}^{(53-m)/\sigma} e^{-x^2/2}\,dx = 1000\left\{\frac{1}{\sqrt{2\pi}} \int_0^{0.68} e^{-x^2/2}\,dx + \frac{1}{\sqrt{2\pi}} \int_0^{0.76} e^{-x^2/2}\,dx\right\}$$

$= 1000\,\{.2517 + .2764\}$
$= 528.$

Exercises

1. If $X$ is a normal variate with mean 30 and s.d. 5, find the probabilities that

(i) $26 \le X \le 40$, (ii) $|X - 30| > 5$.

[B.Sc. (Hons.) Delhi 69]
[Ans. (i) 0.7653, (ii) 0.3174]

2. (a) For a normal distribution with mean 2 and variance 9, find the value $x$ of the variate such that the probability of the variate lying in the interval $(2, x)$ is 0.4115. [B.Sc. (Delhi) 70]

[Ans. $x = 6.05$]

(b) Suppose that a manufacturing process produces washers, about 5 per cent of which are defective (say, too large). If 100 washers are inspected, what is the probability that fewer than 4 are defective ? (Use normal probability tables).

[Hint. $m = np = 100(0.05) = 5$, $\sigma^2 = np(1-p) = 4.75$.

Hence
$$P(X \le 3) = P\left(\frac{0 - 5}{\sqrt{4.75}} \le \frac{X - 5}{\sqrt{4.75}} \le \frac{3 - 5}{\sqrt{4.75}}\right) = \Phi(-0.92) - \Phi(-2.3) = 0.168.]$$

3. In a distribution, exactly normal, 7% of the items are under 35 and 89% are under 63. What are the mean and s.d. of the distribution ? [B.Sc. (Hons.) Delhi 68]

[Ans. $m = 50.29$ approx., $\sigma = 10.36$ approx.]

4. The incomes of a group of 10,000 persons were found to be normally distributed with mean = Rs. 750 p.m. and s.d. = Rs. 50. Show that of this group about 95% had income exceeding Rs. 668 and only 5% had income exceeding Rs. 832. What was the lowest income among the richest 100 ? [Ans. Rs. 866.34]

5. The local authorities in a certain city installed 2,000 electric lamps in streets. The lamps have an average life of 1,000 burning hours with a s.d. of 200 hours.

(i) What number of lamps might be expected to fail in the first 700 burning hours ?

(ii) After what period of burning hours would you expect that 10% of the lamps would have failed ?

Assume that lives of the lamps are normally distributed.

Given that if
$$F(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-t^2/2}\,dt,$$
then $F(1.50) = 0.933$
and $F(1.28) = 0.900$. [Ans. 134 lamps, after 744 hrs.]

6. Assuming the mean height of soldiers to be 68.22 inches with a variance of 10.8 (inches)$^2$, find how many soldiers in a regiment of 1000 would you expect to be over 6 feet tall.

(Given : area under the standard normal curve between $u = 0$ and $u = 0.35$ is 0.1368 ; and between $u = 0$ and $u = 1.15$, it is 0.3746)

[B.A. (Hons.) Delhi 58, I.A.S. 54]

[Ans. 125]

7. In a sample of 1000 cases the mean of a certain test is 14 and s.d. 2.5. Assuming the normality of the distribution find (i) how many candidates score between 12 and 15 ? (ii) How many score above 18 ? (iii) How many score below 8 ? (iv) What is the probability that a candidate selected at random will score above 15 ? [B.Sc. Lucknow 48]

[Ans. (i) 444, (ii) 55, (iii) 8, (iv) 0.345]

8. Two hundred and fifty-five metal rods were cut roughly 6 inches oversize. Finally the lengths of the oversize amount were measured exactly and grouped with 1-inch intervals, there being in all 12 groups : ½″ to 1½″, ..., 11½″ to 12½″.

The following distribution for the 255 lengths was obtained :

Central value x :   1   2   3   4   5   6   7   8   9  10  11  12
Frequency y     :   2  10  19  25  40  44  41  28  25  15   5   1

It is required to fit a normal curve to this data.

9. Calculate the frequencies of the normal distribution which has the same total frequency, mean and s.d. as the following distribution, for the intervals 10-12, 12-14 etc.

x :  10-  12-  14-  16-  18-  20-  22-  24-  26-
f :   4   30  106  206  272  219  120   37    6

[Ans. Normal freq. : 7, 31, 101, 208, 270, 222, 115, 37, 9]

10. The following table gives the frequency of occurrence of a variate $x$ between certain limits.

Variate (x)                        Frequency
Less than 40                          30
40 or more but less than 50           33
50 and more                           37
                                     ---
Total                                100

The distribution is exactly normal. Find the average and s.d. of $x$, and hence the frequency between $x = 30$ and $x = 40$, and between $x = 50$ and $x = 60$.

[Ans. $\bar{x} = 46.1$, $\sigma = 11.68$, 21.6, 25.3]

8.5. The Normal Probability Integral or Error function.

For the normal curve
$$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$$
we put $2\sigma^2 = \dfrac{1}{h^2}$ and get
$$y = \frac{h}{\sqrt{\pi}}\, e^{-h^2x^2}.$$

Then
$$P[\,|X| \le x\,] = \frac{h}{\sqrt{\pi}} \int_{-x}^{x} e^{-h^2x^2}\,dx = \frac{2}{\sqrt{\pi}} \int_0^x e^{-h^2x^2}\,h\,dx,$$

or we may write

$$\phi(hx) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-h^2x^2}\,h\,dx, \quad\text{where}\quad \phi(y) = \frac{2}{\sqrt{\pi}} \int_0^y e^{-y^2}\,dy.$$

$\phi(y)$ is known as the error function or the probability integral. The values of $\phi$ for different values of $y$ have been tabulated.

In the theory of errors, it is customary to take the p.d.f. as

$$y = \frac{h}{\sqrt{\pi}}\, e^{-h^2x^2}, \quad\text{where}\quad h^2 = \frac{1}{2\sigma^2}.$$



$h$ is called the "precision". As $h$ increases, the normal curve becomes narrower, and hence $h$ measures in a sense the closeness of the bulk of observations to the true value.
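The error function of this section is available in most programming languages, so the link between the precision $h$, the s.d. $\sigma$ and the function $\Phi$ can be checked numerically. The following Python sketch is an illustration only; the function names are assumptions introduced here.

```python
import math

def precision(sigma):
    """Precision h defined by h^2 = 1/(2 sigma^2)."""
    return 1.0 / (sigma * math.sqrt(2.0))

def prob_abs_error_within(x, sigma):
    """P(|X| <= x) for errors distributed N(0, sigma^2), i.e. phi(h x)."""
    return math.erf(precision(sigma) * x)

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# phi(h x) should agree with 2*Phi(x/sigma) - 1.
for sigma in (0.5, 1.0, 2.0):
    x = 1.0
    print(sigma, round(precision(sigma), 4),
          round(prob_abs_error_within(x, sigma), 4),
          round(2 * Phi(x / sigma) - 1, 4))
```

A larger precision (smaller $\sigma$) gives a larger probability that the error lies within the same limits $\pm x$.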

The normal distribution as an Error distribution. Gauss was led to the normal distribution by inquiring what law of distribution errors of observation should obey in order to make the arithmetic mean of a set of measurements the most likely value of the "true" magnitude.

Suppose we take a universe of measurements of some magnitude, and consider the universe of deviations from the true value. Let us also suppose that any deviation is the result of the operation of an indefinitely large number of small causes, each producing a small perturbation, positive and negative perturbations being equally likely.

Suppose $\delta$ is the amount of perturbation. Since positive and negative perturbations are equally likely, the expected frequency of $m$ positive errors and $n - m$ negative errors in $N$ observations is the term $N\binom{n}{m}(\tfrac12)^m(\tfrac12)^{n-m}$ in the expansion of $N(\tfrac12 + \tfrac12)^n$, and the actual error is seen to be $m\delta - (n - m)\delta = (2m - n)\delta$. Similarly, the expected frequency of $m + 1$ positive errors and $n - m - 1$ negative errors in $N$ observations is the term $N\binom{n}{m+1}(\tfrac12)^{m+1}(\tfrac12)^{n-m-1}$ in $N(\tfrac12 + \tfrac12)^n$, and the actual error is $(m + 1)\delta - (n - m - 1)\delta = (2(m + 1) - n)\delta$; and so on.

Proceeding to the limit, as $n$ becomes large, the distribution of errors $x$ about the true value (taken as zero) is given by the law

$$y = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-x^2/2\sigma^2\right).$$

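Calculation of $F(x) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{x} e^{-x^2/2}\,dx$ for different values of $x$.

(i) Small values of $x$.

$$F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0} e^{-x^2/2}\,dx + \frac{1}{\sqrt{2\pi}} \int_0^x e^{-x^2/2}\,dx = \frac12 + \frac{1}{\sqrt{2\pi}} \int_0^x e^{-x^2/2}\,dx$$

or

$$F(x) - \frac12 = \frac{1}{\sqrt{2\pi}} \int_0^x \left(1 - \frac{x^2}{2} + \frac{x^4}{2^2\,2!} - \frac{x^6}{2^3\,3!} + \cdots\right) dx = \frac{1}{\sqrt{2\pi}} \left[x - \frac{x^3}{2\cdot3} + \frac{x^5}{2\cdot4\cdot5} - \frac{x^7}{2\cdot4\cdot6\cdot7} + \cdots\right].$$

(ii) Large values of $x$.

$$F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}\,dx - \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-x^2/2}\,dx$$

or

$$1 - F(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} \frac{1}{x}\, e^{-x^2/2}\, x\,dx \qquad (\text{put } x^2 = 2y)$$

$$= \frac{1}{\sqrt{2\pi}} \cdot \frac{1}{\sqrt2} \int_y^{\infty} y^{-1/2} e^{-y}\,dy$$

$$= \frac{1}{2\sqrt{\pi}} \left\{\left[-y^{-1/2} e^{-y}\right]_y^{\infty} - \frac12 \int_y^{\infty} y^{-3/2} e^{-y}\,dy\right\}$$

$$= \frac{1}{2\sqrt{\pi}} \left\{y^{-1/2} e^{-y} - \frac12\, y^{-3/2} e^{-y} + \frac{1\cdot3}{2^2} \int_y^{\infty} y^{-5/2} e^{-y}\,dy\right\}$$

$$= \frac{e^{-y}}{2\sqrt{\pi y}} \left\{1 - \frac{1}{2y} + \frac{1\cdot3}{2^2 y^2} - \cdots\right\}$$

$$= \frac{1}{x\sqrt{2\pi}}\, e^{-x^2/2} \left\{1 - \frac{1}{x^2} + \frac{1\cdot3}{x^4} - \cdots\right\}.$$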
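The two expansions can be compared with a library value of $F(x)$. The following Python sketch is an illustration under stated assumptions: `F_small` sums the power series of case (i), `upper_tail_large` evaluates the first few terms of the expansion of case (ii), and the reference values come from `math.erf` and `math.erfc`. The number of terms used is an arbitrary choice.

```python
import math

def F_small(x, terms=30):
    """F(x) from the power series of case (i); suitable for small |x|."""
    total = 0.0
    for k in range(terms):
        # k-th term of the integrated series: (-1)^k x^(2k+1) / (2^k k! (2k+1))
        total += (-1) ** k * x ** (2 * k + 1) / (2 ** k * math.factorial(k) * (2 * k + 1))
    return 0.5 + total / math.sqrt(2.0 * math.pi)

def upper_tail_large(x, terms=3):
    """1 - F(x) from the expansion of case (ii); suitable for large x."""
    series, coeff = 0.0, 1.0
    for k in range(terms):
        series += coeff
        coeff *= -(2 * k + 1) / (x * x)   # 1, -1/x^2, 1*3/x^4, ...
    return math.exp(-0.5 * x * x) / (x * math.sqrt(2.0 * math.pi)) * series

def F_exact(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def upper_tail_exact(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

print(F_small(1.0), F_exact(1.0))
print(upper_tail_large(4.0), upper_tail_exact(4.0))
```

The series in case (ii) is not convergent: for a fixed $x$ the terms eventually grow, so only the first few terms should be used, and the approximation improves as $x$ increases.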


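Ex. Assume the probability that a deviation lies between $-x$ and $+x$ is given by

$$\phi(hx) = \frac{2h}{\sqrt{\pi}} \int_0^x e^{-h^2x^2}\,dx,$$

where
$$\phi(y) = \frac{2}{\sqrt{\pi}} \int_0^y e^{-y^2}\,dy.$$

Then

(i) $$\phi(y) = 1 - \frac{e^{-y^2}}{y\sqrt{\pi}} \left[1 - \frac{1}{2y^2} + \frac{1\cdot3}{(2y^2)^2} - \frac{1\cdot3\cdot5}{(2y^2)^3} + \cdots\right]$$

[M.Sc. (Agra) 57]

(ii) $$\phi(y) = \frac{2y}{\sqrt{\pi}}\, e^{-y^2} \left[1 + \frac{2y^2}{1\cdot3} + \frac{(2y^2)^2}{1\cdot3\cdot5} + \frac{(2y^2)^3}{1\cdot3\cdot5\cdot7} + \cdots\right]$$

[Hint. (i) $\phi(y) = \dfrac{2}{\sqrt{\pi}} \displaystyle\int_0^{\infty} e^{-y^2}\,dy - \dfrac{2}{\sqrt{\pi}} \int_y^{\infty} e^{-y^2}\,dy$.]

[M.Sc. (Agra) 46, 55]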







8.6. Central Limit Theorem. If a random variable $X$ may be represented as a sum of $n$ independent random variables (satisfying certain conditions which hold in most applications), then this sum, for sufficiently large $n$, is approximately normally distributed. This remarkable result is known as the Central limit theorem. The central limit theorem is one of the most important theorems in statistics, as well as one of the most remarkable theorems in the whole of mathematics.

Central limit theorem. [Lindeberg-Levy Theorem]. Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of independent random variables with $E(X_i) = m_i$ and $\mathrm{Var}(X_i) = \sigma_i^2$, $i = 1, 2, \ldots$ Let $X = X_1 + X_2 + \cdots + X_n$. Then under certain general conditions

$$Z_n = \frac{X - \sum_{i=1}^{n} m_i}{\sqrt{\sum_{i=1}^{n} \sigma_i^2}}$$

has approximately the distribution $N(0, 1)$.

That is, if $G_n(z)$ is the cumulative distribution function of the random variable $Z_n$, we have

$$\lim_{n\to\infty} G_n(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt = \Phi(z).$$

This theorem is obviously the generalisation of the De Moivre-Laplace approximation. Here the independent random variables $X_i$ may possess any kind of distribution, but they must have finite expectation and finite variance. The individual terms in the sum contribute very little to the variation of the sum, and it is very unlikely that any single term makes a very large contribution to the sum; errors of measurement have this characteristic. The central limit theorem states that the summands need not be normally distributed in order for the sum to be approximated by a normal distribution.

We discuss below a special and important case of this theorem.

Theorem. Let $X_1, \ldots, X_n$ be $n$ independent random variables all of which have the same distribution with common mean $m$ and common variance $\sigma^2$. Let $S = \sum_{i=1}^{n} X_i$. Then $E(S) = nm$ and $V(S) = n\sigma^2$, and we have, for large $n$, that

$$Z_n = \frac{S - nm}{\sigma\sqrt{n}}$$

has approximately the distribution $N(0, 1)$ in the sense that

$$\lim_{n\to\infty} P(Z_n \le z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt = \Phi(z).$$


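Proof. Let $M$ be the common m.g.f. of the $X_i$'s. Then

$$M_S(t) = E(e^{St}) = E\left(e^{(X_1 + \cdots + X_n)t}\right) = E(e^{X_1 t})\,E(e^{X_2 t}) \cdots E(e^{X_n t}) = [M_{X_i}(t)]^n,$$

since the $X_i$'s are independent.

The m.g.f. of $Z_n$ is given by

$$M_{Z_n}(t) = E(e^{tZ_n}) = E\left\{e^{(S - nm)t/(\sigma\sqrt{n})}\right\} = e^{-(\sqrt{n}\,m/\sigma)t}\, E\left[e^{(t/(\sigma\sqrt{n}))S}\right] = e^{-(\sqrt{n}\,m/\sigma)t} \left[M_{X_i}\left(\frac{t}{\sigma\sqrt{n}}\right)\right]^n.$$

Thus
$$\log M_{Z_n}(t) = -\frac{\sqrt{n}\,m}{\sigma}\,t + n \log M\left(\frac{t}{\sigma\sqrt{n}}\right)$$

[we denote $M_{X_i}(t)$ simply by $M(t)$ for convenience].

By Maclaurin's series :

$$M(t) = 1 + M'(0)\,t + \frac{M''(0)\,t^2}{2} + R,$$

where $R$ is the remainder term and $M'(0) = m$, $M''(0) = m^2 + \sigma^2$.

Hence
$$\log M_{Z_n}(t) = -\frac{\sqrt{n}\,m\,t}{\sigma} + n \log\left[1 + \frac{mt}{\sqrt{n}\,\sigma} + \frac{(m^2 + \sigma^2)t^2}{2n\sigma^2} + R\right].$$

But $\log(1 + x) = x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \cdots$, where here

$$x = \frac{mt}{\sqrt{n}\,\sigma} + \frac{(m^2 + \sigma^2)t^2}{2n\sigma^2} + R$$

and $|x| < 1$ for large $n$.

Then
$$\log M_{Z_n}(t) = -\frac{\sqrt{n}\,m\,t}{\sigma} + n\left[\left(\frac{mt}{\sqrt{n}\,\sigma} + \frac{(m^2 + \sigma^2)t^2}{2n\sigma^2} + R\right) - \frac12\left(\frac{mt}{\sqrt{n}\,\sigma} + \frac{(m^2 + \sigma^2)t^2}{2n\sigma^2} + R\right)^2 + \cdots\right].$$

The terms in $\sqrt{n}$ cancel, and the terms free of $n$ combine to $\dfrac{(m^2 + \sigma^2)t^2}{2\sigma^2} - \dfrac{m^2t^2}{2\sigma^2} = \dfrac{t^2}{2}$, so that

$$\lim_{n\to\infty} \log M_{Z_n}(t) = \frac{t^2}{2}$$

(since any term having a positive power of $n$ in the denominator approaches zero and also all terms involving $R$ approach zero as $n \to \infty$).

Hence
$$\lim_{n\to\infty} M_{Z_n}(t) = e^{t^2/2}.$$

But this is the m.g.f. of a random variable with distribution $N(0, 1)$. Because of the uniqueness property of the m.g.f. we may conclude that the random variable $Z_n$ converges in distribution (as $n \to \infty$) to the distribution $N(0, 1)$.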

Remark. From the above special form of the central limit theorem we can easily conclude that the arithmetic mean $\bar{X} = \dfrac{1}{n} \sum_{i=1}^{n} X_i$ of $n$ observations from the same random variable has, for large $n$, approximately a normal distribution, that is

$$\frac{\bar{X} - m}{\sigma/\sqrt{n}} \sim N(0, 1) \text{ for large } n.$$

Example. Let $V_1, \ldots, V_n$ be a number of independent noise voltages which are received in what is called an "adder". Let $V = V_1 + V_2 + \cdots + V_n$. Suppose that each of the random variables $V_i$ is uniformly distributed over the interval $[0, 10]$. Compute, for $n = 20$, the probability that the total incoming voltage exceeds 105 volts.

Since $V_i$ is uniformly distributed over the interval $[0, 10]$,

$$E(V_i) = \frac{10}{2} = 5 \text{ volts}, \qquad \mathrm{Var}(V_i) = \frac{(10)^2}{12}.$$

According to the central limit theorem, if $n$ is sufficiently large, the random variable

$$\frac{(V - 5n)\sqrt{12}}{10\sqrt{n}}$$

has approximately the distribution $N(0, 1)$.

Thus if $n = 20$,

$$P(V > 105) = P\left(\frac{V - 100}{(10/\sqrt{12})\sqrt{20}} > \frac{105 - 100}{(10/\sqrt{12})\sqrt{20}}\right) \approx 1 - \Phi(0.388) \approx 0.35.$$
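The normal approximation in this example can be compared with a direct simulation. The sketch below is illustrative: the number of trials and the seed are arbitrary assumptions, and the simulated value is subject to sampling error.

```python
import math
import random

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, a, b = 20, 0.0, 10.0                      # 20 voltages, each uniform on [0, 10]
mean_V = n * (a + b) / 2                     # E(V) = 100
sd_V = math.sqrt(n * (b - a) ** 2 / 12)      # s.d. of V

z = (105 - mean_V) / sd_V
normal_approx = 1 - Phi(z)

def simulate(trials=20000, seed=1):
    """Fraction of simulated totals that exceed 105 volts."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        total = sum(rng.uniform(a, b) for _ in range(n))
        if total > 105:
            hits += 1
    return hits / trials

estimate = simulate()
print("z =", round(z, 3))
print("normal approximation:", round(normal_approx, 3))
print("simulated frequency :", round(estimate, 3))
```

Even with only 20 summands the two values agree closely, which illustrates how quickly the sum of uniform variables approaches normality.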

Note. The distribution of a sum of independent variables may not converge to the normal distribution if the terms do not have the same distribution, even if all the random variables have standard deviations. We now state Lyapunov's theorem (without proof), which gives a sufficient condition for a sum of independent variables to have a limiting normal distribution.

Theorem. Let $X_1, X_2, \ldots, X_n$ be independent random variables whose moments of the third order exist. Let

$E(X_i) = m_i$, $\mathrm{Var}(X_i) = \sigma_i^2 \ne 0$, $E(X_i - m_i)^3 = a_i$
and $E(|X_i - m_i|^3) = b_i$. Furthermore, let

$$B_n = \left(\sum_{i=1}^{n} b_i\right)^{1/3}, \qquad C_n = \left(\sum_{i=1}^{n} \sigma_i^2\right)^{1/2}.$$

If the relation

$$\lim_{n\to\infty} \frac{B_n}{C_n} = 0$$

is satisfied, then

$$Z_n = \sum_{i=1}^{n} (X_i - m_i)/C_n \sim N(0, 1) \text{ for large } n.$$

Ex. 1. State and prove the central limit theorem for independent identically distributed random variables with finite variance. State Lyapunov's conditions for the central limit theorem for non-identical independent random variables.

Ex. 2. A sequence of independent random variates $\{X_k\}$ is defined by $P\{X_k = 1\} = p_k$, $P\{X_k = 0\} = q_k = 1 - p_k$.

Show that it is both necessary and sufficient for the central limit theorem to hold for this sequence that the series $\sum p_k q_k$ diverges.

8.7. Role of the Normal Distribution in Statistics.

The central limit theorem alone ensures that the normal distribution occupies an important place in statistics. Many of the distributions encountered in practice are close to the normal distribution to a good degree of approximation. This phenomenon appears quite reasonable in view of the central limit theorem.

The normal distribution is favoured for the fact that sampling distributions based on a parent normal distribution are fairly manageable analytically. When we make inferences about populations from samples, it is necessary to derive distributions for various functions of the sample observations. The mathematical problem of obtaining these distributions is often easier for samples from a normal population than from any other. The normal curve and the normal integral have numerous mathematical properties which make them attractive and comparatively easy to manipulate.

While applying statistical methods based on the normal distribution, the experimenter must know approximately the general form of the distribution function which his data follow. If it is normal, he can use the methods directly; if it is not, he may transform his data so that the transformed observations follow a normal distribution. A universe which is skew with respect to a variate $x$, for instance, might be normal when we take $\sqrt{x}$ as the variate.

Ex. Discuss the role of the normal distribution in Statistics. Explain the significance of the parameters appearing in the distribution.

In a certain normal distribution 31% of the members are under 45 units and 8% are over 64 units. Find the mean and S.D. of the distribution to the nearest integer.

8.8. The Exponential Distribution. The p.d.f. of an exponential distribution with parameter $\alpha > 0$ is given by

$f(x) = \alpha e^{-\alpha x}$, $x > 0$
$\quad\ \ = 0$ elsewhere.

Here the continuous random variable $X$ assumes only non-negative values, and

$$\int_0^{\infty} f(x)\,dx = \alpha \int_0^{\infty} e^{-\alpha x}\,dx = 1.$$

The exponential distribution plays an important role in describing a large class of phenomena, particularly in the area of reliability theory (the reliability of a component equals the probability that the component is still functioning at time $t$, i.e. it is defined as $R(t) = P(T > t)$, where $T$ is the life length of the component).



Properties of the Exponential Distribution.

(i) The c.d.f. of the exponential distribution is given by

$$F(x) = P(X \le x) = \int_0^x \alpha e^{-\alpha t}\,dt = 1 - e^{-\alpha x}, \quad x \ge 0$$
$\quad\ \ = 0$, elsewhere.

Thus $P(X > x) = 1 - P(X \le x) = e^{-\alpha x}$.

(ii) The expected value of $X$ is obtained as follows :

$$E(X) = \int_0^{\infty} x\,\alpha e^{-\alpha x}\,dx = \left[-x e^{-\alpha x}\right]_0^{\infty} + \int_0^{\infty} e^{-\alpha x}\,dx = \frac{1}{\alpha}.$$

Thus the expected value equals the reciprocal of the parameter $\alpha$. Let $\alpha = 1/\beta$ and then $f(x) = (1/\beta)\,e^{-x/\beta}$. In this form

$$E(X) = \int_0^{\infty} \frac{x}{\beta}\, e^{-x/\beta}\,dx = \beta.$$

(iii) For finding the variance we obtain first

$$E(X^2) = \int_0^{\infty} x^2\,\alpha e^{-\alpha x}\,dx = \frac{2}{\alpha^2} \quad \text{[Integrate by parts]}$$

$$V(X) = E(X^2) - [E(X)]^2 = \frac{1}{\alpha^2}.$$

(iv) The exponential distribution has the following interesting property. Consider, for any $s, t > 0$, $P(X > s + t \mid X > s)$. We have

$$P(X > s + t \mid X > s) = \frac{P(X > s + t)}{P(X > s)} = \frac{e^{-\alpha(s+t)}}{e^{-\alpha s}} = e^{-\alpha t}.$$

This result is interpreted as saying that the exponential distribution has "no memory". The only continuous random variable $X$ assuming non-negative values for which

$$P(X > s + t \mid X > s) = P(X > t) \text{ for all } s, t > 0$$

is an exponentially distributed random variable.

It may be noted that the only continuous function $G$ having the property that $G(x + y) = G(x)\,G(y)$ for all $x, y > 0$ is $G(x) = e^{-kx}$. $G(x)$ will satisfy this condition if we define $G(x) = 1 - F(x)$, where

$$F(x) = \int_0^x \alpha e^{-\alpha t}\,dt.$$
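Properties (ii) to (iv) can be checked numerically. In the following Python sketch the value $\alpha = 0.5$, the integration range and the step count are arbitrary assumptions, and the moments are obtained by a simple midpoint rule rather than by the integration by parts used above.

```python
import math

def survival(x, alpha):
    """P(X > x) = e^(-alpha x) for x >= 0."""
    return math.exp(-alpha * x) if x >= 0 else 1.0

def conditional_survival(s, t, alpha):
    """P(X > s + t | X > s)."""
    return survival(s + t, alpha) / survival(s, alpha)

def moment(k, alpha, upper=60.0, steps=200000):
    """Midpoint-rule value of the integral of x^k * alpha * e^(-alpha x) over (0, upper)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** k * alpha * math.exp(-alpha * x)
    return total * h

alpha = 0.5
mean = moment(1, alpha)
variance = moment(2, alpha) - mean ** 2

print("E(X) =", round(mean, 4), " 1/alpha   =", 1 / alpha)
print("V(X) =", round(variance, 4), " 1/alpha^2 =", 1 / alpha ** 2)
print(conditional_survival(3.0, 2.0, alpha), survival(2.0, alpha))
```

The last line prints two equal numbers: the chance of surviving a further 2 units of time does not depend on the 3 units already survived.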

8.9. The Gamma Distribution.

We define first the Gamma function. The Gamma function, denoted by $\Gamma$, is defined as follows :

$$\Gamma(p) = \int_0^{\infty} x^{p-1} e^{-x}\,dx, \text{ defined for } p > 0.$$

In fact, $\Gamma(p) = (p - 1)\,\Gamma(p - 1)$ and $\Gamma(n) = (n - 1)!$ if $p$ is a positive integer, say $p = n$, and

$$\Gamma(\tfrac12) = \int_0^{\infty} x^{-1/2} e^{-x}\,dx = \sqrt{\pi}.$$

With the aid of the Gamma function we can now introduce the Gamma probability distribution.

Definition. A continuous random variable $X$ assuming all non-negative values is said to have a Gamma probability distribution with two parameters, $r$ and $\alpha$, if its p.d.f. is given by

$$f(x) = \frac{\alpha}{\Gamma(r)}\,(\alpha x)^{r-1} e^{-\alpha x}, \quad x > 0$$ ...(1)
$\quad\ \ = 0$, elsewhere,

where $r > 0$ and $\alpha > 0$.

It is easy to see that
$$\int_0^{\infty} f(x)\,dx = \frac{\alpha}{\Gamma(r)} \int_0^{\infty} (\alpha x)^{r-1} e^{-\alpha x}\,dx = 1.$$

Properties of the Gamma Distribution.

(i) If $r = 1$, $f(x)$ in (1) becomes $\alpha e^{-\alpha x}$. Hence the exponential distribution is a special case of the Gamma distribution.

(ii) In most of our applications, the parameter $r$ will be a positive integer. In this case we find an interesting relationship between the c.d.f. of the Gamma distribution and the Poisson distribution.

Consider the integral
$$I_r = \int_a^{\infty} \frac{e^{-y} y^r}{r!}\,dy,$$
where $r$ is a positive integer and $a > 0$.

Now
$$r!\, I_r = \int_a^{\infty} e^{-y} y^r\,dy = e^{-a} a^r + r \int_a^{\infty} e^{-y} y^{r-1}\,dy \quad \text{(integrating by parts)},$$

that is, $r!\, I_r = e^{-a} a^r + r \cdot \{(r - 1)!\}\, I_{r-1}$.

This is a recurrence relation, hence

$(r - 1)!\, I_{r-1} = e^{-a} a^{r-1} + (r - 1) \cdot \{(r - 2)!\}\, I_{r-2}$
$(r - 2)!\, I_{r-2} = e^{-a} a^{r-2} + (r - 2) \cdot \{(r - 3)!\}\, I_{r-3}$
$\ldots$

so that

$r!\, I_r = e^{-a}\left[a^r + r a^{r-1} + r(r - 1) a^{r-2} + \cdots + r!\right].$

Hence
$$I_r = e^{-a}\left[1 + a + \frac{a^2}{2!} + \cdots + \frac{a^r}{r!}\right] = \sum_{k=0}^{r} \frac{e^{-a} a^k}{k!} = \sum_{k=0}^{r} P(Y = k),$$

where $Y$ has a Poisson distribution with parameter $a$.

The c.d.f. of the random variable whose p.d.f. is given by (1) above is

$$F(x) = P(X \le x) = 1 - P(X > x) = 1 - \int_x^{\infty} \frac{\alpha}{(r - 1)!}\,(\alpha s)^{r-1} e^{-\alpha s}\,ds, \quad x > 0.$$

Put $\alpha s = u$ and we obtain

$$F(x) = 1 - \int_{\alpha x}^{\infty} \frac{u^{r-1} e^{-u}}{(r - 1)!}\,du, \quad x > 0.$$

This integral is precisely of the form $I_{r-1}$ above (with $a = \alpha x$), and thus

$$F(x) = 1 - \sum_{k=0}^{r-1} \frac{e^{-\alpha x} (\alpha x)^k}{k!}, \quad x > 0.$$

Hence the c.d.f. of the Gamma distribution may be expressed in terms of the tabulated c.d.f. of the Poisson distribution (we recall that this is valid if the parameter $r$ is a positive integer).
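The relation between the Gamma c.d.f. and the Poisson distribution can be verified numerically. The sketch below is illustrative only: the parameter values $r = 3$, $\alpha = 2$, $x = 1.5$ are arbitrary, and the comparison value is obtained by a simple midpoint-rule integration of the density (1).

```python
import math

def gamma_cdf_via_poisson(x, r, alpha):
    """F(x) = 1 - sum_{k=0}^{r-1} e^(-alpha x) (alpha x)^k / k!, for integer r >= 1."""
    if x <= 0:
        return 0.0
    a = alpha * x
    return 1.0 - sum(math.exp(-a) * a ** k / math.factorial(k) for k in range(r))

def gamma_cdf_numeric(x, r, alpha, steps=100000):
    """Midpoint-rule integral of the Gamma density (1) from 0 to x."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += alpha * (alpha * s) ** (r - 1) * math.exp(-alpha * s) / math.gamma(r)
    return total * h

r, alpha, x = 3, 2.0, 1.5
print(gamma_cdf_via_poisson(x, r, alpha))
print(gamma_cdf_numeric(x, r, alpha))
```

With $r = 1$ the Poisson sum reduces to the single term $e^{-\alpha x}$, and the formula gives back the exponential c.d.f. $1 - e^{-\alpha x}$.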

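(iii) If $X$ has a Gamma distribution given by equation (1) we have

$$E(X) = \frac{r}{\alpha}, \qquad V(X) = \frac{r}{\alpha^2}.$$

For,
$$E(X) = \frac{\alpha}{\Gamma(r)} \int_0^{\infty} x\,(\alpha x)^{r-1} e^{-\alpha x}\,dx = \frac{1}{\alpha\,\Gamma(r)} \int_0^{\infty} y^{r} e^{-y}\,dy \quad (\text{put } y = \alpha x)$$

$$= \frac{1}{\alpha}\,\frac{\Gamma(r + 1)}{\Gamma(r)} = \frac{r}{\alpha},$$

$$E(X^2) = \frac{\alpha}{\Gamma(r)} \int_0^{\infty} x^2 (\alpha x)^{r-1} e^{-\alpha x}\,dx = \frac{r(r + 1)}{\alpha^2} \quad \text{(proceed as above)}.$$

Thus
$$V(X) = \frac{r(r + 1)}{\alpha^2} - \frac{r^2}{\alpha^2} = \frac{r}{\alpha^2}.$$

(iv) The m.g.f. of the Gamma variate with parameters $\alpha$ and $r$ is given by

$$M_X(t) = \frac{\alpha}{\Gamma(r)} \int_0^{\infty} e^{tx} (\alpha x)^{r-1} e^{-\alpha x}\,dx = \frac{\alpha^r}{\Gamma(r)} \int_0^{\infty} x^{r-1} e^{-x(\alpha - t)}\,dx.$$

(This converges when $\alpha > t$.)

Let $x(\alpha - t) = u$; thus $dx = \dfrac{du}{\alpha - t}$ and we obtain

$$M_X(t) = \frac{\alpha^r}{(\alpha - t)\,\Gamma(r)} \int_0^{\infty} \left(\frac{u}{\alpha - t}\right)^{r-1} e^{-u}\,du = \left(\frac{\alpha}{\alpha - t}\right)^r \frac{1}{\Gamma(r)} \int_0^{\infty} u^{r-1} e^{-u}\,du = \left(\frac{\alpha}{\alpha - t}\right)^r.$$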

Remark. If $r = 1$, the Gamma distribution becomes the exponential distribution; for with $r = 1$,

$f(x) = \alpha e^{-\alpha x}$, $x > 0$
$\quad\ \ = 0$, otherwise.

If $\alpha = 1$, the Gamma density becomes

$f(x) = \dfrac{1}{\Gamma(r)}\, x^{r-1} e^{-x}$, $x > 0$
$\quad\ \ = 0$, otherwise.

This is the p.d.f. of a Gamma variate with single parameter $r$, which we shall denote by $\gamma(r)$ variate.

For $\alpha = 1$, the m.g.f. of the Gamma variate with parameter $r$ becomes

$$M_X(t) = (1 - t)^{-r} = 1 + rt + \frac{r(r + 1)}{2!}\,t^2 + \cdots,$$

that is $\mu_1' = r$, $\mu_2' = r(r + 1)$.

Hence the cumulant generating function is

$$K_X(t) = \log M_X(t) = -r \log(1 - t) = r\left[t + \frac{t^2}{2} + \frac{t^3}{3} + \cdots\right].$$

The $p$th cumulant is therefore

$$\kappa_p = r\,(p - 1)! = r\,\Gamma(p).$$

(Reproductive property of the Gamma Distribution)

Suppose now that $X$ and $Y$ are independent Gamma variates with parameters $r$ and $l$ respectively. Then

$$M_{X+Y}(t) = M_X(t)\,M_Y(t) = (1 - t)^{-(r + l)}.$$

But this is the m.g.f. of a $\gamma(r + l)$ variate. Hence the sum of two independent Gamma variates, with parameters $r$ and $l$, is a Gamma variate with parameter $r + l$.
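The reproductive property can be illustrated by simulation. The sketch below draws independent $\gamma(r)$ and $\gamma(l)$ variates with the standard library function `random.gammavariate` (shape first, scale second; scale 1 corresponds to $\alpha = 1$) and compares the mean and variance of their sum with $r + l$, the common value of the first two cumulants of a $\gamma(r + l)$ variate. The parameter values, seed and sample size are arbitrary assumptions, and agreement of two moments is of course only a partial check.

```python
import random
import statistics

def sample_sum(r, l, trials=20000, seed=7):
    """Simulated values of X + Y with X ~ gamma(r), Y ~ gamma(l), both with alpha = 1."""
    rng = random.Random(seed)
    return [rng.gammavariate(r, 1.0) + rng.gammavariate(l, 1.0) for _ in range(trials)]

r, l = 2.0, 3.0
sums = sample_sum(r, l)
mean_hat = statistics.fmean(sums)
var_hat = statistics.pvariance(sums)

print("sample mean    :", round(mean_hat, 3), " expected", r + l)
print("sample variance:", round(var_hat, 3), " expected", r + l)
```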

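Theorem. If $X$ is a normal variate with mean $m$ and s.d. $\sigma$, then $\dfrac12\left(\dfrac{X - m}{\sigma}\right)^2$ is a Gamma variate with parameter $\tfrac12$.

We have

$$dF = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac12 (x - m)^2/\sigma^2}\,dx, \quad -\infty < x < \infty.$$

Let $u = \tfrac12 (x - m)^2/\sigma^2$,

so that as $x$ goes from $-\infty$ to $+\infty$, $u$ goes from $\infty$ to 0 and back again from 0 to $\infty$.

Thus
$$dF = 2 \cdot \frac{1}{\sigma\sqrt{2\pi}}\, e^{-u}\, \frac{\sigma\,du}{\sqrt{2u}} = \frac{e^{-u} u^{1/2 - 1}}{\Gamma(\tfrac12)}\,du, \quad 0 < u < \infty$$

[observe that, taking the two halves together, $dF = \dfrac{2}{\sigma\sqrt{2\pi}}\, e^{-\frac12 (x - m)^2/\sigma^2}\,dx$, $0 < x - m < \infty$],

so that $u$ is a $\gamma(\tfrac12)$ variate.

Note. If we take $u = \dfrac{v}{2}$, where $v = \left(\dfrac{X - m}{\sigma}\right)^2$ is the square of the standard normal variate, then

$$dF = \frac{e^{-v/2}\, v^{1/2 - 1}}{2^{1/2}\,\Gamma(\tfrac12)}\,dv, \quad 0 < v < \infty,$$

i.e.
$$f(v) = \frac{e^{-v/2}\, v^{1/2 - 1}}{2^{1/2}\,\Gamma(\tfrac12)}, \quad 0 < v < \infty.$$

This is the pdf of a $\chi^2$ distribution with one degree of freedom (see below). Thus we find that the square of a random variable with distribution $N(0, 1)$ has a $\chi_1^2$-distribution.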

Example. Let $X_1, X_2, \dots, X_r$ be independent and identically distributed random variables, each having an exponential distribution with the same parameter $\alpha$. Let $Z = X_1 + X_2 + \dots + X_r$. Then Z has a Gamma distribution with parameters $\alpha$ and r.

We have

$$M_{X_i}(t) = \int_0^\infty e^{tx}\,\alpha e^{-\alpha x}\,dx = \alpha\int_0^\infty e^{x(t-\alpha)}\,dx = \frac{\alpha}{\alpha - t}, \qquad t < \alpha.$$

Hence

$$M_Z(t) = M_{X_1}(t)\,M_{X_2}(t)\cdots M_{X_r}(t) = \left(\frac{\alpha}{\alpha - t}\right)^r,$$

which is the mgf of the Gamma distribution with parameters $\alpha$ and r.
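A small simulation can make this example concrete. The sketch below is only an illustration under stated assumptions: the function names, the seed and the sample size are hypothetical choices, and the check uses the mean $r/\alpha$ and variance $r/\alpha^2$ of the Gamma distribution with parameters $\alpha$ and r.

```python
import random

def simulate_exponential_sums(alpha, r, trials, seed=0):
    """Draw `trials` values of Z = X_1 + ... + X_r with X_i exponential(alpha)."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(alpha) for _ in range(r)) for _ in range(trials)]

def sample_mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var

if __name__ == "__main__":
    alpha, r = 2.0, 3
    mean, var = sample_mean_and_variance(simulate_exponential_sums(alpha, r, 20000))
    print("sample mean", mean, "theory", r / alpha)
    print("sample variance", var, "theory", r / alpha ** 2)
```

The sample values only approximate the theoretical ones; the agreement improves as the number of trials grows.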

Ex. Suppose the continuous random variable X has distribution N(0, 1); then $Y = X^2$ has a Gamma distribution with density function

$$f(y) = \frac{y^{-1/2}\,e^{-y/2}}{\sqrt{2\pi}} \quad \text{for } y > 0,$$

= 0, elsewhere.

8.10. The Chi-square Distribution. Let us now consider the special case of the Gamma distribution in which $\alpha = \frac12$ and $r = n/2$, where n is a positive integer. We obtain a one-parameter family of distributions with pdf

$$f(x) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\,x^{(n/2)-1}\,e^{-x/2}, \qquad x > 0.$$

A random variable X having the above pdf is said to have a chi-square distribution with n degrees of freedom (denoted by $\chi_n^2$). In the figure, the pdf for n = 1, 2 and n > 2 is shown.




Thus for a $\chi_n^2$ variate

E(X) = n, V(X) = 2n.

The chi-square distribution has many important applications in statistical inference. Because of its importance, the chi-square distribution is tabulated for various values of the parameter n. Thus we may find in the table that value, denoted by $\chi_\alpha^2$, satisfying $P(X \le \chi_\alpha^2) = \alpha$, $0 < \alpha < 1$. (See Fig.)

We may look at the $\chi^2$ distribution from another angle which is rather important in statistics. We have seen in the theorem above that if $x_i$ is a $N(m_i, \sigma_i^2)$ variate, then $\frac12\left(\frac{x_i - m_i}{\sigma_i}\right)^2$ is a $\gamma(\frac12)$ variate and $\left(\frac{x_i - m_i}{\sigma_i}\right)^2$ is a $\chi_1^2$ variate. Thus because of the reproductive property of the Gamma variates we have the following result.

If $X_i$, i = 1, 2, ..., n, are independent, normally distributed variables with common mean zero and with variances $\sigma_i^2$, and $\chi^2 = \sum_i x_i^2/\sigma_i^2$, then $\frac12\chi^2$ is a Gamma variate with parameter n/2.

The pdf of a $\chi^2$ variate with n degrees of freedom is given by

$$f(\chi^2) = \frac{e^{-\chi^2/2}\,(\chi^2)^{(n-2)/2}}{2^{n/2}\,\Gamma(n/2)}, \qquad 0 < \chi^2 < \infty,$$

= 0 otherwise, so that

$$E(\chi_n^2) = n, \qquad \mathrm{Var}(\chi_n^2) = 2n$$

and

$$M_{\chi^2}(t) = E\left(e^{t\chi^2}\right) = (1-2t)^{-n/2}.$$

If $\chi_n^2$ and $\chi_m^2$ are two independent chi-square variates, then

$$M_{\chi_n^2 + \chi_m^2}(t) = (1-2t)^{-n/2}\,(1-2t)^{-m/2} = (1-2t)^{-(1/2)(n+m)}.$$

But this is the mgf of a $\chi_{n+m}^2$ variate. Thus, if $\chi_n^2$ and $\chi_m^2$ are independent $\chi^2$ variates with n and m degrees of freedom, then $\chi_n^2 + \chi_m^2$ is a $\chi^2$ variate with (n+m) d.f. We have the theorems:

Theorem. Suppose that the distribution of $X_i$ is $\chi_{n_i}^2$, i = 1, 2, ..., k, where the $X_i$'s are independent random variables. Let $Z = X_1 + \dots + X_k$. Then Z has distribution $\chi_n^2$, where $n = n_1 + \dots + n_k$.

Theorem. Suppose that $X_1, \dots, X_k$ are independent random variables, each having distribution N(0, 1). Then $S = X_1^2 + X_2^2 + \dots + X_k^2$ has distribution $\chi_k^2$.

Ex. Suppose that $X_1, \dots, X_n$ are independent random variables each with distribution N(0, 1). Find the pdf of T defined by

$$T = \sqrt{X_1^2 + X_2^2 + \dots + X_n^2}.$$

Clearly $T^2$ has distribution $\chi_n^2$.

Hence

$$F(t) = P(T \le t) = P(T^2 \le t^2) = \int_0^{t^2} \frac{1}{2^{n/2}\,\Gamma(n/2)}\,x^{n/2-1}\,e^{-x/2}\,dx.$$

Hence

$$f(t) = F'(t) = \frac{2t\,(t^2)^{n/2-1}\,e^{-t^2/2}}{2^{n/2}\,\Gamma(n/2)} = \frac{2\,t^{n-1}\,e^{-t^2/2}}{2^{n/2}\,\Gamma(n/2)} \quad \text{if } t \ge 0.$$

Note. (a) If n = 2, $f(t) = t\,e^{-t^2/2}$, $t \ge 0$. The distribution of $\sqrt{X_1^2 + X_2^2}$ is known as the Rayleigh distribution.

(b) If n = 3, the distribution of $\sqrt{X_1^2 + X_2^2 + X_3^2}$ is known as Maxwell's distribution and its pdf is

$$\frac{2\,t^2\,e^{-t^2/2}}{2^{3/2}\,\Gamma(3/2)} = \frac{2\,t^2\,e^{-t^2/2}}{\sqrt{2\pi}}, \qquad t \ge 0.$$
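The general density of T and its two special cases can be compared directly. The following is a minimal sketch with hypothetical function names; it only evaluates the formulas given in the example and the note.

```python
from math import exp, gamma, pi, sqrt

def chi_pdf(t, n):
    """Density of T = sqrt(X_1^2 + ... + X_n^2) for independent N(0, 1) variates."""
    if t < 0:
        return 0.0
    return 2.0 * t ** (n - 1) * exp(-t * t / 2.0) / (2.0 ** (n / 2.0) * gamma(n / 2.0))

def rayleigh_pdf(t):
    """Special case n = 2: t * exp(-t^2 / 2)."""
    return t * exp(-t * t / 2.0) if t >= 0 else 0.0

def maxwell_pdf(t):
    """Special case n = 3: 2 t^2 exp(-t^2 / 2) / sqrt(2 pi)."""
    return 2.0 * t * t * exp(-t * t / 2.0) / sqrt(2.0 * pi) if t >= 0 else 0.0

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.5):
        print(t, chi_pdf(t, 2), rayleigh_pdf(t), chi_pdf(t, 3), maxwell_pdf(t))
```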


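Ex. Suppose $X_1, X_2, X_3$ denote a random sample of size three from a distribution N(0, 1). Let $Y = X_1^2 + X_2^2 + X_3^2$. Find directly the pdf of Y.

[Hint.

$$F(y) = \iiint_R \frac{1}{(2\pi)^{3/2}}\,e^{-\frac12(x_1^2 + x_2^2 + x_3^2)}\,dx_1\,dx_2\,dx_3,$$

where $R = \{(x_1, x_2, x_3) : x_1^2 + x_2^2 + x_3^2 \le y\}$.

Change to spherical coordinates, that is

$$x_1 = r\cos\theta\sin\phi, \quad x_2 = r\sin\theta\sin\phi, \quad x_3 = r\cos\phi,$$

$r \ge 0$, $0 \le \theta < 2\pi$, $0 \le \phi \le \pi$. Then, for $y \ge 0$,

$$F(y) = \frac{1}{(2\pi)^{3/2}}\int_0^{\sqrt y}\int_0^{2\pi}\int_0^{\pi} e^{-r^2/2}\,r^2\sin\phi\;d\phi\,d\theta\,dr = \sqrt{\frac{2}{\pi}}\int_0^{\sqrt y} r^2\,e^{-r^2/2}\,dr \qquad [\text{Put } r = \sqrt u]$$

$$= \frac{1}{\sqrt{2\pi}}\int_0^{y} \sqrt{u}\;e^{-u/2}\,du, \qquad y \ge 0.$$

Since Y is a random variable of the continuous type, the pdf of Y is $f(y) = F'(y)$. Thus

$$f(y) = \frac{1}{\sqrt{2\pi}}\,y^{3/2-1}\,e^{-y/2}, \qquad 0 < y < \infty,$$

= 0, elsewhere,

or

$$f(y) = \frac{1}{2^{3/2}\,\Gamma(3/2)}\,y^{3/2-1}\,e^{-y/2},$$

that is, Y is a $\chi_3^2$ variate.]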


Geometrical Proof of the $\chi^2$ distribution. Let $X_1, X_2, \dots, X_n$ be n independent normal variates each with distribution N(0, 1). We are interested in finding the probability distribution function of

$$\chi^2 = \sum_{i=1}^{n} x_i^2.$$

The joint distribution of independent standard normal variates is given by

$$dP = \frac{1}{(2\pi)^{n/2}}\,e^{-\frac12\sum x_i^2}\,dx_1\,dx_2\cdots dx_n,$$

where $\sum_{i=1}^{n} x_i^2 = \chi^2$.

The volume enclosed by the hypersphere $\sum x_i^2 = \chi^2$ is proportional to $\chi^n$, and the element of volume between this and the adjacent hypersphere of radius $\chi + d\chi$ is proportional to $d(\chi^n)$, that is to $\chi^{n-1}\,d\chi$ [i.e. the element of volume $dx_1\,dx_2\cdots dx_n$ transforms into $\chi^{n-1}\,d\chi$, up to a constant factor].

Hence

$$dP \propto e^{-\frac12\chi^2}\,\chi^{n-1}\,d\chi$$

or

$$dP \propto \left(\tfrac12\chi^2\right)^{\frac12(n-2)} e^{-\frac12\chi^2}\,d\!\left(\tfrac12\chi^2\right).$$

Since $0 < \chi^2 < \infty$ and the integral of the pdf over the whole range must be unity, the constant of proportionality must be $\dfrac{1}{\Gamma(n/2)}$.

Thus

$$dP = \frac{1}{\Gamma(n/2)}\left(\tfrac12\chi^2\right)^{\frac12(n-2)} e^{-\frac12\chi^2}\,d\!\left(\tfrac12\chi^2\right)$$

or

$$dP = \frac{1}{2^{n/2}\,\Gamma(n/2)}\,e^{-\frac12\chi^2}\,(\chi^2)^{(n-2)/2}\,d(\chi^2), \qquad 0 < \chi^2 < \infty.$$


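Properties of the $\chi^2$ distribution

(i) The moments of the $\chi_n^2$ distribution are given by

$$\mu_r' = E\left[(\chi_n^2)^r\right] = \frac{1}{2^{n/2}\,\Gamma(n/2)}\int_0^\infty (\chi^2)^{r + (n/2) - 1}\,e^{-\chi^2/2}\,d(\chi^2).$$

Put $\chi^2 = 2u$ and obtain

$$\mu_r' = \frac{2^r\,\Gamma(n/2 + r)}{\Gamma(n/2)}.$$

(ii) The recurrence relation for the moments about the origin is

$$\mu_r' = (n + 2r - 2)\,\mu_{r-1}'.$$

(iii) Since $K(t) = \log M_{\chi^2}(t) = -\frac{n}{2}\log(1 - 2t)$, the rth cumulant is

$$\kappa_r = 2^{r-1}\,(r-1)!\;n.$$

(iv) Since $E(\chi_n^2) = n$ and $V(\chi_n^2) = 2n$, $\dfrac{\chi_n^2 - n}{\sqrt{2n}}$ is the reduced (standardized) $\chi^2$ variate.

(v) The $\chi^2$ distribution tends to the normal distribution as the number of degrees of freedom becomes large.

Let M(t) denote the mgf of the reduced variate. Then

$$M(t) = E\exp\left\{\frac{t(\chi^2 - n)}{\sqrt{2n}}\right\} = \exp\left(-\frac{nt}{\sqrt{2n}}\right) E\left\{\exp\left(\frac{t\chi^2}{\sqrt{2n}}\right)\right\} = e^{-nt/\sqrt{2n}}\,M_{\chi^2}\!\left(\frac{t}{\sqrt{2n}}\right)$$

$$= e^{-nt/\sqrt{2n}}\left(1 - \frac{2t}{\sqrt{2n}}\right)^{-n/2} \qquad \left[\text{since } M_{\chi^2}(t) = (1-2t)^{-n/2}\right].$$

Therefore

$$\log M(t) = -\frac{nt}{\sqrt{2n}} - \frac{n}{2}\log\left(1 - \frac{2t}{\sqrt{2n}}\right) = -\frac{nt}{\sqrt{2n}} + \frac{n}{2}\left[\frac{2t}{\sqrt{2n}} + \frac12\left(\frac{2t}{\sqrt{2n}}\right)^2 + \frac13\left(\frac{2t}{\sqrt{2n}}\right)^3 + \dots\right] = \frac{t^2}{2} + O\!\left(n^{-1/2}\right).$$

Hence as $n \to \infty$, $\log M(t) \to \frac12 t^2$ or $M(t) \to e^{t^2/2}$.

But this is the mgf of the standard normal variate. Thus

$$\frac{\chi_n^2 - n}{\sqrt{2n}} \sim N(0, 1) \text{ for large } n.$$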

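(vi) For large values of n, $\sqrt{2\chi_n^2}$ approximates to $N\left(\sqrt{2n-1},\,1\right)$.

This is known as Fisher's approximation. We have

$$\lim_{n\to\infty} P\left\{\sqrt{2\chi_n^2} - \sqrt{2n-1} \le x\right\} = \lim_{n\to\infty} P\left\{\chi_n^2 \le \tfrac12\left[x + \sqrt{2n-1}\right]^2\right\}$$

$$= \lim_{n\to\infty} P\left\{\frac{\chi_n^2 - n}{\sqrt{2n}} \le \frac{x^2 - 1 + 2x\sqrt{2n-1}}{2\sqrt{2n}}\right\} = \Phi(x),$$

since the right-hand bound tends to x as $n \to \infty$ and, by (v), the reduced variate tends to N(0, 1).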
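The quality of the two large-sample approximations in (v) and (vi) can be compared numerically. The sketch below is an illustration only: the function names are hypothetical, the exact distribution function is computed from the usual series for the incomplete Gamma function (an assumption not taken from the text), and n = 30 with x = 43.77 (close to the tabulated upper 5 per cent point) is an arbitrary test case.

```python
from math import erf, exp, lgamma, log, sqrt

def chi2_cdf(x, n):
    """Exact P(chi^2_n <= x) via the incomplete Gamma series."""
    if x <= 0:
        return 0.0
    a, z = n / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > 1e-16 * abs(total):
        k += 1
        term *= z / (a + k)
        total += term
    return total * exp(a * log(z) - z - lgamma(a))

def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def normal_approx_cdf(x, n):
    """Property (v): (chi^2 - n) / sqrt(2n) treated as N(0, 1)."""
    return normal_cdf((x - n) / sqrt(2.0 * n))

def fisher_approx_cdf(x, n):
    """Property (vi): sqrt(2 chi^2) - sqrt(2n - 1) treated as N(0, 1)."""
    return normal_cdf(sqrt(2.0 * x) - sqrt(2.0 * n - 1.0))

if __name__ == "__main__":
    n, x = 30, 43.77
    print("exact ", chi2_cdf(x, n))
    print("normal", normal_approx_cdf(x, n))
    print("Fisher", fisher_approx_cdf(x, n))
```

In this test case Fisher's approximation lies closer to the exact value than the direct normal approximation, which is why it is preferred for moderate n.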


Example 1. For two degrees of freedom, show that the probability P of a value of $\chi^2$ greater than $\chi_0^2$ is $\exp(-\frac12\chi_0^2)$, and hence that $\chi_0^2 = 2\log_e(1/P)$. Deduce the value of $\chi_0^2$ when P = .05.

Now

$$P = P(\chi^2 > \chi_0^2) = \int_{\chi_0^2}^{\infty} \tfrac12\exp\left(-\tfrac12\chi^2\right)d(\chi^2) \quad (\text{here } n = 2)$$

$$= \exp\left(-\tfrac12\chi_0^2\right).$$

Hence $\log P = -\frac12\chi_0^2$ or $\chi_0^2 = 2\log(1/P)$. When P = .05, we get

$$\chi_0^2 = 2\log_e 20 = 2\log_e 10 \times \log_{10} 20 = 2\,(2.3026)\,(1.3010) = 5.991.$$
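The arithmetic of this example is easy to reproduce. The following minimal sketch (the function name is hypothetical) evaluates $\chi_0^2 = 2\log_e(1/P)$ for two degrees of freedom and checks it against the tail probability $\exp(-\frac12\chi_0^2)$.

```python
from math import exp, log

def chi2_critical_value_2df(p):
    """Value chi0^2 with P(chi^2 > chi0^2) = p when there are 2 degrees of freedom."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return 2.0 * log(1.0 / p)

if __name__ == "__main__":
    value = chi2_critical_value_2df(0.05)
    print(value)               # about 5.991
    print(exp(-value / 2.0))   # recovers the tail probability 0.05
```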


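Example 2. The recursion relation for the central moments of the $\chi^2$ distribution with n d.f. is

$$\mu_{r+1} = 2r\left(\mu_r + n\,\mu_{r-1}\right).$$

Now, writing $u = \chi_n^2$,

$$2r\left(\mu_r + n\,\mu_{r-1}\right) = 2r\left[E(u-n)^r + n\,E(u-n)^{r-1}\right] = 2r\,E\left[(u-n)^{r-1}\{(u-n) + n\}\right] = 2r\,E\left[(u-n)^{r-1}\,u\right]$$

$$= \frac{2r}{2^{n/2}\,\Gamma(n/2)}\int_0^\infty (u-n)^{r-1}\,u\cdot u^{(n/2)-1}\,e^{-u/2}\,du = \frac{2r}{2^{n/2}\,\Gamma(n/2)}\int_0^\infty (u-n)^{r-1}\,u^{n/2}\,e^{-u/2}\,du.$$

Integrating by parts, with $(u-n)^r/r$ as the integral of $(u-n)^{r-1}$, the integrated term $(u-n)^r\,u^{n/2}\,e^{-u/2}$ vanishes at both limits, and the expression becomes

$$-\frac{2}{2^{n/2}\,\Gamma(n/2)}\int_0^\infty (u-n)^r\left(\frac{n}{2}\,u^{(n/2)-1} - \frac12\,u^{n/2}\right)e^{-u/2}\,du = \frac{1}{2^{n/2}\,\Gamma(n/2)}\int_0^\infty (u-n)^{r+1}\,u^{(n/2)-1}\,e^{-u/2}\,du = \mu_{r+1}.$$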






Exercises

1. x is a rectangular variate with the distribution

p(x) = 1, 0 < x < 1,
= 0, otherwise.

Show that $-2\log x$ follows the $\chi^2$ distribution with 2 d.f. Suggest a practical use of this result.

[Hint. $dF = dx$, $0 < x < 1$,

$$= \tfrac12\,e^{-u/2}\,du, \quad \text{where } u = -2\log x,$$

so that $f(u) = \tfrac12\,e^{-u/2}$, $u > 0$.

If a sample of size n is drawn from the rectangular distribution with values $x_1, x_2, \dots, x_n$, then because of the reproductive property $-2\log P$ has the $\chi^2$ distribution with 2n d.f., where $P = x_1 x_2 \cdots x_n$.]


2. The prob. density function of $\chi^2$ with n d.f. is given by

$$p(\chi^2) = k\,e^{-\chi^2/2}\,(\chi^2)^{(n/2)-1} \qquad (0 < \chi^2 < \infty).$$

Find k; obtain the corresponding m.g.f. Show that for large values of n, the standardized $\chi^2$ tends to be normally distributed.

3. $\chi^2$ is a chi-square variate with n d.f. Show that (i) the mean and variance of $\chi^2$ are respectively n and 2n, (ii) the standardised variable $\dfrac{\chi^2 - n}{\sqrt{2n}}$ approaches the standardised normal as $n \to \infty$. State the practical use of this result.

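4. For $\chi^2$ with even d.f. $\nu$, prove that the probability P of exceeding a given value of $\chi^2$ is

$$P = e^{-p}\left\{1 + p + \frac{p^2}{2!} + \dots + \frac{p^r}{r!}\right\},$$

where $p = \frac12\chi^2$ and $r = \frac12(\nu - 2)$.

[Hint.

$$I = \frac{1}{\Gamma(r+1)}\int_p^\infty e^{-t}\,t^r\,dt = \frac{e^{-p}\,p^r}{r!} + \frac{1}{\Gamma(r)}\int_p^\infty e^{-t}\,t^{r-1}\,dt.$$

Continuing this process

$$I = e^{-p}\left\{1 + p + \frac{p^2}{2!} + \dots + \frac{p^r}{r!}\right\}.]$$

5. Show that if $\nu$ is even

$$P = \frac{1}{2^{(\nu-2)/2}\,\Gamma(\nu/2)}\int_{\chi}^{\infty} e^{-\chi^2/2}\,\chi^{\nu-1}\,d\chi = e^{-\chi^2/2}\left[1 + \frac{\chi^2}{2} + \frac{\chi^4}{2\cdot4} + \dots + \frac{\chi^{\nu-2}}{2\cdot4\cdot6\cdots(\nu-2)}\right]$$

and hence the values of P for a given $\chi^2$ can be derived from tables of Poisson's exponential limit.

[Hint. Let

$$I_\nu = \int_{\chi}^{\infty} e^{-\chi^2/2}\,\chi^{\nu-1}\,d\chi = e^{-\chi^2/2}\,\chi^{\nu-2} + (\nu-2)\,I_{\nu-2}, \text{ etc.}$$

Put $m = \chi^2/2$ and then

$$P = e^{-m}\left[1 + m + \frac{m^2}{2!} + \frac{m^3}{3!} + \dots + \frac{m^{(\nu-2)/2}}{\{(\nu-2)/2\}!}\right] = \sum_{x=0}^{(\nu-2)/2} \frac{e^{-m}\,m^x}{x!}.]$$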
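The Poisson form of the tail probability for even degrees of freedom can be checked against a direct evaluation of the $\chi^2$ distribution function. The sketch below is illustrative only; the function names are hypothetical and the distribution function is computed from the usual incomplete Gamma series, which is an assumption rather than something taken from the exercises.

```python
from math import exp, factorial, lgamma, log

def chi2_upper_tail_even(x, nu):
    """P(chi^2_nu > x) for even nu, as the Poisson sum
    sum_{k=0}^{(nu-2)/2} e^{-m} m^k / k!  with m = x / 2."""
    if nu <= 0 or nu % 2 != 0:
        raise ValueError("nu must be a positive even integer")
    m = x / 2.0
    return exp(-m) * sum(m ** k / factorial(k) for k in range(nu // 2))

def chi2_cdf(x, n):
    """P(chi^2_n <= x) via the incomplete Gamma series."""
    if x <= 0:
        return 0.0
    a, z = n / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > 1e-16 * abs(total):
        k += 1
        term *= z / (a + k)
        total += term
    return total * exp(a * log(z) - z - lgamma(a))

if __name__ == "__main__":
    for nu in (2, 4, 10):
        for x in (1.0, 5.0, 12.5):
            print(nu, x, chi2_upper_tail_even(x, nu), 1.0 - chi2_cdf(x, nu))
```

The last two columns should agree, so a table of cumulative Poisson probabilities with mean $m = \chi^2/2$ gives the tail area directly.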



6. Write a brief explanatory note on transformation of variates. Show that if X is an exponential variate, then $e^{-X}$ is a rectangular variate.

Also prove that the sum of a number of independent identical exponential variates is a chi-square variate and deduce the additive property of chi-square variates. [M.A. Agra 67]

7. The variables $x_1, x_2, \dots, x_n$ are independently distributed in the rectangular form:

$$dP = dx, \qquad 0 \le x \le 1.$$

Then, if $P = x_1 x_2 x_3 \cdots x_n$, show that $-2\log_e P$ has the $\chi^2$ distribution with 2n D.F. [B.Sc. Krk. 70]

[Hint. $-2\log P = \sum_i \left(-2\log x_i\right)$.

Let $u_i = -2\log x_i$, that is $x_i = e^{-u_i/2}$.

Now $dF = dx$, $0 \le x \le 1$, so that

$$dF = \tfrac12\,e^{-u_i/2}\,du_i, \qquad 0 \le u_i < \infty,$$

$$= \frac{1}{2^{2/2}\,\Gamma(2/2)}\,e^{-\frac12 u_i}\,(u_i)^{(2/2)-1}\,du_i,$$

which follows the probability law of the $\chi^2$ distribution with 2 D.F.

From the additive property of the $\chi^2$ distribution, it follows that $-2\log P = \sum_i u_i$ has the $\chi^2$ distribution with 2n D.F.]
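This result lends itself to a quick simulation. The sketch below is an illustration only: the function names, the seed and the sample size are hypothetical choices, and the check uses the mean 2n and variance 4n of a $\chi^2$ variate with 2n d.f.

```python
import random
from math import log

def minus_two_log_product(n, trials, seed=0):
    """Simulate -2 log(x_1 x_2 ... x_n) for independent rectangular variates."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        # 1 - rng.random() lies in (0, 1], which avoids taking log(0).
        values.append(-2.0 * sum(log(1.0 - rng.random()) for _ in range(n)))
    return values

def sample_mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    count = len(values)
    mean = sum(values) / count
    var = sum((v - mean) ** 2 for v in values) / (count - 1)
    return mean, var

if __name__ == "__main__":
    n = 5
    mean, var = sample_mean_and_variance(minus_two_log_product(n, 20000))
    print("sample mean", mean, "theory", 2 * n)
    print("sample variance", var, "theory", 4 * n)
```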



8.11. Beta distribution of the first kind. The Beta function is defined by the integral

$$B(m, n) = \int_0^1 x^{m-1}\,(1-x)^{n-1}\,dx,$$

which converges for m > 0, n > 0.

If we substitute $x = \dfrac{1}{1+y}$, we get

$$B(m, n) = \int_0^\infty \frac{y^{n-1}}{(1+y)^{m+n}}\,dy.$$

The Beta function is symmetrical in m and n, that is,

$$B(m, n) = B(n, m).$$

The Beta and Gamma functions are connected by the relation

$$B(m, n) = \frac{\Gamma(m)\,\Gamma(n)}{\Gamma(m+n)}.$$

[For proof, see the authors' text book of Integral Calculus.]

The pdf of the Beta variate of the first kind is given by

$$f(x) = \frac{x^{l-1}\,(1-x)^{m-1}}{B(l, m)}, \qquad 0 < x < 1. \qquad \dots(1)$$

It has two parameters l and m. Such a variate is denoted as a $\beta_1(l, m)$ variate.


The mean of the $\beta_1(l, m)$ variate is given by

$$E(X) = \int_0^1 \frac{x^{l}\,(1-x)^{m-1}}{B(l, m)}\,dx = \frac{B(l+1, m)}{B(l, m)} = \frac{l}{l+m},$$

$$E(X^2) = \int_0^1 \frac{x^{l+1}\,(1-x)^{m-1}}{B(l, m)}\,dx = \frac{B(l+2, m)}{B(l, m)} = \frac{l\,(l+1)}{(l+m)(l+m+1)}.$$

Hence

$$V(X) = E(X^2) - E^2(X) = \frac{lm}{(l+m)^2\,(l+m+1)}.$$
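The moment formulas can be verified from the Beta-Gamma relation. The sketch below is a minimal illustration with hypothetical function names; it compares the ratios of Beta functions with the closed forms for the mean and variance.

```python
from math import gamma

def beta_fn(p, q):
    """B(p, q) = Gamma(p) Gamma(q) / Gamma(p + q)."""
    return gamma(p) * gamma(q) / gamma(p + q)

def beta1_raw_moment(k, l, m):
    """k-th raw moment of a beta_1(l, m) variate: B(l + k, m) / B(l, m)."""
    return beta_fn(l + k, m) / beta_fn(l, m)

def beta1_mean_and_variance(l, m):
    """Closed forms l/(l+m) and lm/((l+m)^2 (l+m+1))."""
    return l / (l + m), l * m / ((l + m) ** 2 * (l + m + 1))

if __name__ == "__main__":
    l, m = 2.0, 3.0
    mean = beta1_raw_moment(1, l, m)
    variance = beta1_raw_moment(2, l, m) - mean ** 2
    print(mean, variance, beta1_mean_and_variance(l, m))
```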

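Theorem. Let the random variables X and Y be independent and have Gamma distributions with parameters l and m respectively. Then the variate $\dfrac{X}{X+Y}$ has a Beta distribution of the first kind with parameters l and m.

Since the random variables are independent, their joint distribution is

$$dF = \frac{1}{\Gamma(l)\,\Gamma(m)}\,e^{-x}\,x^{l-1}\,e^{-y}\,y^{m-1}\,dx\,dy = \frac{1}{\Gamma(l)\,\Gamma(m)}\,e^{-(x+y)}\,x^{l-1}\,y^{m-1}\,dx\,dy, \qquad 0 < x < \infty,\ 0 < y < \infty.$$

Let us make the transformation

$$u = x + y, \qquad v = \frac{x}{x+y},$$

so that $x = uv$, $y = u(1-v)$.

Then as x and y range from 0 to $\infty$, u ranges from 0 to $\infty$ and v from 0 to 1. Also

$$\frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} v & u \\ 1-v & -u \end{vmatrix} = -u.$$

Thus

$$dF = \frac{1}{\Gamma(l)\,\Gamma(m)}\,e^{-(x+y)}\,x^{l-1}\,y^{m-1}\left|\frac{\partial(x, y)}{\partial(u, v)}\right|du\,dv, \qquad 0 < u < \infty,\ 0 < v < 1,$$

$$= \frac{e^{-u}\,u^{l+m-2}\,v^{l-1}\,(1-v)^{m-1}}{\Gamma(l)\,\Gamma(m)}\,u\,du\,dv = \frac{e^{-u}\,u^{l+m-1}\,du}{\Gamma(l+m)}\cdot\frac{v^{l-1}\,(1-v)^{m-1}\,dv}{B(l, m)}.$$

Hence the variates u and v are independent, u is a $\gamma(l+m)$ variate and v a $\beta_1(l, m)$ variate.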
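The theorem can be illustrated by simulation. The following sketch rests on stated assumptions: the function names, seed and sample size are hypothetical, Gamma variates are drawn with the standard library call `random.gammavariate(shape, scale)` using unit scale, and independence of u and v is probed only through their sample correlation.

```python
import random

def simulate_u_v(l, m, trials, seed=0):
    """Return pairs (u, v) = (x + y, x / (x + y)) for independent
    Gamma variates x and y with parameters l and m (unit scale)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(trials):
        x = rng.gammavariate(l, 1.0)
        y = rng.gammavariate(m, 1.0)
        pairs.append((x + y, x / (x + y)))
    return pairs

def correlation(pairs):
    """Sample correlation coefficient of the two coordinates."""
    n = len(pairs)
    mu = sum(u for u, _ in pairs) / n
    mv = sum(v for _, v in pairs) / n
    suv = sum((u - mu) * (v - mv) for u, v in pairs)
    suu = sum((u - mu) ** 2 for u, _ in pairs)
    svv = sum((v - mv) ** 2 for _, v in pairs)
    return suv / (suu * svv) ** 0.5

if __name__ == "__main__":
    l, m = 2.0, 3.0
    pairs = simulate_u_v(l, m, 20000)
    print("mean of u", sum(u for u, _ in pairs) / len(pairs), "theory", l + m)
    print("mean of v", sum(v for _, v in pairs) / len(pairs), "theory", l / (l + m))
    print("correlation of u and v", correlation(pairs))
```

A correlation near zero is consistent with independence but does not prove it; the proof is the factorisation of dF given above.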

Ex. 1. If X and Y are independently distributed like $\chi_m^2$ and $\chi_n^2$, find the density of $\dfrac{X}{X+Y}$.

Ex. 2. If X and Y are independently distributed like $\beta_1(l, m)$ and $\gamma(l+m)$, find the density of XY.

[Hint. The probability differential for x is

$$dp = \frac{x^{l-1}\,(1-x)^{m-1}\,dx}{B(l, m)}, \qquad 0 < x < 1.$$

Let $z = xy$. Hence, for a fixed value of y in the interval dy, the probability that z will fall in the interval dz is

$$\frac{1}{B(l, m)}\left(\frac{z}{y}\right)^{l-1}\left(1 - \frac{z}{y}\right)^{m-1}\frac{dz}{y}.$$

Multiply this by the probability that a random value of y will fall in the interval dy, and integrate over the range of y from z to $\infty$ (since z < y). Thus the probability that, for any value of y, z will be in the interval dz, is

$$dP = \frac{z^{l-1}\,dz}{B(l, m)\,\Gamma(l+m)}\int_z^\infty e^{-y}\,y^{l+m-1}\left(\frac{1}{y}\right)^{l-1}\left(1 - \frac{z}{y}\right)^{m-1}\frac{dy}{y}$$

$$= \frac{z^{l-1}\,dz}{\Gamma(l)\,\Gamma(m)}\int_z^\infty e^{-y}\,(y-z)^{m-1}\,dy$$

$$= \frac{e^{-z}\,z^{l-1}\,dz}{\Gamma(l)\,\Gamma(m)}\int_0^\infty e^{-zt}\,z^{m}\,t^{m-1}\,dt \qquad [\text{put } y - z = zt]$$

$$= \frac{e^{-z}\,z^{l-1}\,dz}{\Gamma(l)\,\Gamma(m)}\int_0^\infty e^{-u}\,u^{m-1}\,du \qquad [\text{put } zt = u]$$

$$= \frac{e^{-z}\,z^{l-1}\,dz}{\Gamma(l)}.$$

Consequently z is a $\gamma(l)$ variate.]

8.12. Beta distribution of the Second kind. A continuous variate x having the following probability density

$$f(x) = \frac{1}{B(l, m)}\,\frac{x^{l-1}}{(1+x)^{l+m}}, \qquad x \ge 0,\ l, m > 0, \qquad \dots(1)$$

= 0, elsewhere

is said to have the Beta distribution of the second kind with parameters l and m and is denoted as a $\beta_2(l, m)$ variate.

Properties of the $\beta_2(l, m)$ distribution.

(i) $f'(x) = 0 \Rightarrow x = \dfrac{l-1}{m+1}$; that is, there is a mode at $\dfrac{l-1}{m+1}$, provided l > 1.

(ii)

$$E(X) = \frac{1}{B(l, m)}\int_0^\infty \frac{x^{l}}{(1+x)^{l+m}}\,dx = \frac{B(l+1, m-1)}{B(l, m)} = \frac{l}{m-1}.$$

The mean exists when m > 1.

(iii)

$$E(X^2) = \frac{1}{B(l, m)}\int_0^\infty \frac{x^{l+1}}{(1+x)^{l+m}}\,dx = \frac{B(l+2, m-2)}{B(l, m)} = \frac{l\,(l+1)}{(m-1)(m-2)}.$$

Thus when m > 2,

$$V(X) = E(X^2) - E^2(X) = \frac{l\,(l+m-1)}{(m-1)^2\,(m-2)}.$$

(iv) If x is a $\beta_2(l, m)$ variate, then its reciprocal 1/x is a $\beta_2(m, l)$ variate. This may easily be verified by putting x = 1/y in (1) above.

(v) To each Beta variate of the first kind, there correspond two Beta variates of the second kind.

This is easily seen if we put x = 1/(1+y) in the pdf of the $\beta_1(l, m)$ variate; then y is a $\beta_2(m, l)$ variate and, by the reciprocal property of the $\beta_2$ variate, 1/y is a $\beta_2(l, m)$ variate.

(vi) If X and Y are independently distributed like $\gamma(l)$ and $\gamma(m)$, then X/Y is a $\beta_2(l, m)$ variate.

The joint probability distribution of X, Y is given by

$$dP = \frac{e^{-x}\,x^{l-1}\,e^{-y}\,y^{m-1}}{\Gamma(l)\,\Gamma(m)}\,dx\,dy, \qquad 0 < x < \infty,\ 0 < y < \infty.$$

Put z = x/y, keeping y fixed, so that dx = y dz. Then

$$dP = \frac{e^{-yz}\,(yz)^{l-1}\,e^{-y}\,y^{m-1}}{\Gamma(l)\,\Gamma(m)}\,y\,dz\,dy.$$

Now for finding the probability differential of z, integrate out y in the range $0 < y < \infty$. Thus

$$dP = z^{l-1}\,dz\int_0^\infty \frac{y^{l+m-1}\,e^{-y(1+z)}}{\Gamma(l)\,\Gamma(m)}\,dy.$$

Put y(1+z) = u, keeping z fixed. Thus

$$dP = \frac{z^{l-1}\,dz}{\Gamma(l)\,\Gamma(m)}\int_0^\infty e^{-u}\left(\frac{u}{1+z}\right)^{l+m-1}\frac{du}{1+z}$$

or

$$dP = \frac{z^{l-1}\,dz\;\Gamma(l+m)}{\Gamma(l)\,\Gamma(m)\,(1+z)^{l+m}} = \frac{1}{B(l, m)}\,\frac{z^{l-1}\,dz}{(1+z)^{l+m}}, \qquad 0 \le z < \infty.$$

Consequently Z, i.e. X/Y, is a $\beta_2(l, m)$ variate.

Remark. This result may be directly proved by the transformation U = X + Y, V = X/Y.
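The mean and variance stated in properties (ii) and (iii) follow from ratios of Beta functions, and this can be confirmed numerically. The sketch below is only an illustration; the function names are hypothetical.

```python
from math import gamma

def beta_fn(p, q):
    """B(p, q) = Gamma(p) Gamma(q) / Gamma(p + q)."""
    return gamma(p) * gamma(q) / gamma(p + q)

def beta2_raw_moment(k, l, m):
    """k-th raw moment of a beta_2(l, m) variate, B(l + k, m - k) / B(l, m);
    it exists only when m > k."""
    if m <= k:
        raise ValueError("the moment of order k exists only for m > k")
    return beta_fn(l + k, m - k) / beta_fn(l, m)

def beta2_mean_and_variance(l, m):
    """Closed forms l/(m-1) and l(l+m-1)/((m-1)^2 (m-2)), valid for m > 2."""
    return l / (m - 1), l * (l + m - 1) / ((m - 1) ** 2 * (m - 2))

if __name__ == "__main__":
    l, m = 2.0, 4.0
    mean = beta2_raw_moment(1, l, m)
    variance = beta2_raw_moment(2, l, m) - mean ** 2
    print(mean, variance, beta2_mean_and_variance(l, m))
```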

Ex. 1. If X and Y are independent Gamma variates with parameters l and m respectively, show that the variates

U = X + Y, V = X/Y

are independent and that U is a $\gamma(l+m)$ variate, and V is a $\beta_2(l, m)$ variate.

Also show that X/(X+Y) is a $\beta_1(l, m)$ variate.

[M.A. (Stat.) Delhi 58]


Ex. 2. Find the distribution of the quotient of two independent standard normal variates. [The resulting distribution is called Cauchy's distribution.]

[Hint. Let Z = X/Y, where X and Y each have distribution N(0, 1).

Then $Z^2 = \frac12 X^2 / \frac12 Y^2$. $\frac12 X^2$ and $\frac12 Y^2$ both are Gamma variates with parameter $\frac12$. Hence $Z^2$ is the ratio of two independent Gamma variates with parameter $\frac12$. Therefore $Z^2$ is a $\beta_2(\frac12, \frac12)$ variate with probability differential

$$dP = \frac{1}{B(\frac12, \frac12)}\,\frac{(z^2)^{\frac12 - 1}\,d(z^2)}{(1 + z^2)}, \qquad 0 < z^2 < \infty,$$

$$= \frac{d(z^2)}{\pi\,(1+z^2)\sqrt{z^2}} = \frac{2\,dz}{\pi\,(1+z^2)}, \qquad 0 < z < \infty.$$

Hence

$$dP = \frac{dz}{\pi\,(1+z^2)}, \qquad -\infty < z < \infty.$$

The range of z is from $-\infty$ to $+\infty$ while that of $z^2$ is from 0 to $+\infty$; that is why the factor $\frac12$ is introduced.]
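A simulation gives a quick plausibility check of the Cauchy result. The sketch below is an illustration under stated assumptions: the function names, seed and sample size are hypothetical, and the comparison uses the Cauchy distribution function $F(z) = \frac12 + \frac{1}{\pi}\arctan z$, which follows by integrating the density above.

```python
import random
from math import atan, pi

def simulate_normal_ratios(trials, seed=0):
    """Draw Z = X / Y for independent standard normal X and Y."""
    rng = random.Random(seed)
    ratios = []
    while len(ratios) < trials:
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if y != 0.0:  # guard against the (practically impossible) exact zero
            ratios.append(x / y)
    return ratios

def cauchy_cdf(z):
    """Distribution function of the standard Cauchy distribution."""
    return 0.5 + atan(z) / pi

def empirical_cdf(values, z):
    """Fraction of the sample not exceeding z."""
    return sum(1 for v in values if v <= z) / len(values)

if __name__ == "__main__":
    ratios = simulate_normal_ratios(20000)
    for z in (-1.0, 0.0, 1.0, 3.0):
        print(z, empirical_cdf(ratios, z), cauchy_cdf(z))
```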


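Ex. 3. Show that, if X and Y are independent normal variates with means $m_1$, $m_2$ and variances $\sigma_1^2$, $\sigma_2^2$ respectively, the quotient

$$Z = \frac{X - m_1}{Y - m_2}$$

conforms to the distribution

$$dP = \frac{\sigma_1\sigma_2\,dz}{\pi\left(\sigma_1^2 + \sigma_2^2 z^2\right)}$$

with a range $-\infty$ to $+\infty$.

[M.A. Madras 1960, M.A. Statistics Delhi 60, 65, M.A. Agra 67, I.C.A.R. Delhi 47]

[Hint.

$$\frac{\sigma_2^2 z^2}{\sigma_1^2} = \frac{\frac12\left(\dfrac{X - m_1}{\sigma_1}\right)^2}{\frac12\left(\dfrac{Y - m_2}{\sigma_2}\right)^2}$$

is a $\beta_2(\frac12, \frac12)$ variate.

Consequently the probability differential of $\sigma_2^2 z^2/\sigma_1^2$ is

$$dP = \frac{1}{B(\frac12, \frac12)}\,\frac{\left(\sigma_2^2 z^2/\sigma_1^2\right)^{\frac12 - 1}\,d\left(\sigma_2^2 z^2/\sigma_1^2\right)}{1 + \sigma_2^2 z^2/\sigma_1^2} = \frac{d\left(\sigma_2^2 z^2/\sigma_1^2\right)}{\pi\left(1 + \sigma_2^2 z^2/\sigma_1^2\right)\left(\sigma_2 z/\sigma_1\right)}, \qquad 0 < z < \infty,$$

that is,

$$dP = \frac{\frac12\,d\left(\sigma_2^2 z^2/\sigma_1^2\right)}{\pi\left(1 + \sigma_2^2 z^2/\sigma_1^2\right)\left(\sigma_2 z/\sigma_1\right)}, \qquad -\infty < z < \infty,$$

$$= \frac{\left(\sigma_2^2/\sigma_1^2\right) z\,dz}{\pi\left(1 + \sigma_2^2 z^2/\sigma_1^2\right)\left(\sigma_2 z/\sigma_1\right)} = \frac{\sigma_1\sigma_2\,dz}{\pi\left(\sigma_1^2 + \sigma_2^2 z^2\right)}.]$$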


Ex. 4. X is distributed as a normal random variable with mean 0 and variance 1 and Y is independently distributed as a $\chi^2$ with n degrees of freedom. Find the distribution of $t = \sqrt{n}\,X/\sqrt{Y}$. Find V(t) and obtain the approximate distribution of t for large values of n. [M.A. Agra 62]

[Ans. $V(t) = n/(n-2)$; for large n, $f(t) \approx \dfrac{1}{\sqrt{2\pi}}\exp(-t^2/2)$.]

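8.13. Distribution of the Sum and Ratio of two independent chi-square variates.

Theorem. If $\chi_1^2$ and $\chi_2^2$ are independent $\chi^2$ variates with $n_1$ and $n_2$ degrees of freedom respectively, then

(i) $\chi^2 = \chi_1^2 + \chi_2^2$ is a $\chi^2$ variate with $(n_1 + n_2)$ d.f., and

(ii) $T^2 = \chi_1^2/\chi_2^2$ is a $\beta_2\left(\dfrac{n_1}{2}, \dfrac{n_2}{2}\right)$ variate.

Proof. This theorem can be easily derived from previous theorems if we recall that $\frac12\chi_1^2$ is a $\gamma(n_1/2)$ variate and $\frac12\chi_2^2$ is a $\gamma(n_2/2)$ variate, and hence

$$\tfrac12\chi_1^2 + \tfrac12\chi_2^2 \text{ is a } \gamma\left(\frac{n_1 + n_2}{2}\right) \text{ variate.}$$

Also

$$T^2 = \frac{\chi_1^2}{\chi_2^2} = \frac{\frac12\chi_1^2}{\frac12\chi_2^2} \text{ is a } \beta_2\left(\frac{n_1}{2}, \frac{n_2}{2}\right) \text{ variate.}$$

But here we prove the theorem directly, as it is interesting.

The joint probability distribution of $\chi_1^2$ and $\chi_2^2$ is

$$dP = \frac{1}{2^{\frac12(n_1+n_2)}\,\Gamma(n_1/2)\,\Gamma(n_2/2)}\,e^{-\frac12(\chi_1^2 + \chi_2^2)}\,(\chi_1^2)^{(n_1/2)-1}\,(\chi_2^2)^{(n_2/2)-1}\,d(\chi_1^2)\,d(\chi_2^2), \qquad 0 < \chi_1^2 < \infty,\ 0 < \chi_2^2 < \infty. \qquad \dots(1)$$

Use the transformation

$$\chi_1 = \chi\cos\theta, \qquad \chi_2 = \chi\sin\theta,$$

so that

$$\frac{\partial(\chi_1, \chi_2)}{\partial(\chi, \theta)} = \chi.$$

Hence, since $d(\chi_i^2) = 2\chi_i\,d\chi_i$,

$$dP = \frac{4\,e^{-\frac12(\chi_1^2 + \chi_2^2)}\,\chi_1^{\,n_1 - 1}\,\chi_2^{\,n_2 - 1}}{2^{\frac12(n_1+n_2)}\,\Gamma(n_1/2)\,\Gamma(n_2/2)}\,d\chi_1\,d\chi_2, \qquad 0 < \chi_1 < \infty,\ 0 < \chi_2 < \infty,$$

is transformed into

$$dP = \frac{4\,e^{-\frac12\chi^2}\,(\chi\cos\theta)^{n_1 - 1}\,(\chi\sin\theta)^{n_2 - 1}\,\chi\,d\chi\,d\theta}{2^{\frac12(n_1+n_2)}\,\Gamma(n_1/2)\,\Gamma(n_2/2)}, \qquad 0 < \chi < \infty,\ 0 \le \theta \le \pi/2,$$

$$= \frac{2\,\chi^{n_1+n_2-1}\,e^{-\frac12\chi^2}\,d\chi}{2^{\frac12(n_1+n_2)}\,\Gamma\{(n_1+n_2)/2\}}\cdot\frac{2\cos^{n_1-1}\theta\,\sin^{n_2-1}\theta\,d\theta}{B(n_1/2, n_2/2)}$$

$$= \frac{e^{-\frac12\chi^2}\,(\chi^2)^{\frac12(n_1+n_2-2)}\,d(\chi^2)}{2^{\frac12(n_1+n_2)}\,\Gamma\{(n_1+n_2)/2\}}\cdot\frac{2\cos^{n_1-1}\theta\,\sin^{n_2-1}\theta\,d\theta}{B(n_1/2, n_2/2)}. \qquad \dots(2)$$

Thus it follows that

(i) the variates $\chi^2$ and $\theta$ are independently distributed,

(ii) $\chi^2 = \chi_1^2 + \chi_2^2$ is distributed like $\chi^2$ with $(n_1 + n_2)$ d.f.

The probability differential for $\theta$ is

$$dP = \frac{2\cos^{n_1-1}\theta\,\sin^{n_2-1}\theta\,d\theta}{B(n_1/2, n_2/2)}, \qquad 0 \le \theta \le \pi/2. \qquad \dots(3)$$

In this make the transformation $\cot\theta = T$, so that $-\mathrm{cosec}^2\theta\,d\theta = dT$.

Thus

$$dP = \frac{2\,T^{n_1-1}\,dT}{B(n_1/2, n_2/2)\,(1+T^2)^{(n_1+n_2)/2}}, \qquad 0 < T < \infty,$$

$$= \frac{(T^2)^{(n_1/2)-1}\,d(T^2)}{B(n_1/2, n_2/2)\,(1+T^2)^{(n_1+n_2)/2}}, \qquad 0 < T^2 < \infty.$$

Hence $T^2$ is a $\beta_2(n_1/2, n_2/2)$ variate.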

Ex. 1. Show that if $\chi_1^2$ and $\chi_2^2$ are independently distributed as $\chi^2$ with $n_1$ and $n_2$ d.f. respectively, then

$$Z = \frac{\chi_1^2}{\chi_1^2 + \chi_2^2} \text{ is a } \beta_1\left(\frac{n_1}{2}, \frac{n_2}{2}\right) \text{ variate.}$$

[Hint.

$$Z = \frac{\chi_1^2}{\chi_1^2 + \chi_2^2} = \frac{\chi_1^2/\chi_2^2}{1 + \chi_1^2/\chi_2^2} = \frac{T^2}{1 + T^2},$$

or put $\cos^2\theta = Z$ in eqn. (3) above.]

Ex. 2. If the independent variates X and Y are distributed like $\chi^2$ with $n_1$ and $n_2$ d.f. respectively, then $\dfrac{X}{X+Y}$ is a $\beta_1(n_1/2, n_2/2)$ variate, while X/Y is a $\beta_2(n_1/2, n_2/2)$ variate.

[Hint. Take $u = \dfrac{x}{x+y}$, $v = x + y$.]

8.14. The Log-normal distribution.

Definition. If log X is normally distributed with mean m and variance $s^2$, then the p.d.f. of X is given by

$$f(x) = \frac{1}{x\,s\sqrt{2\pi}}\exp\left\{-\frac12\left(\frac{\log x - m}{s}\right)^2\right\}, \qquad 0 < x < \infty.$$

This is known as the log-normal distribution with parameters m and s.

It is easy to verify that

$$\mu_1' = \exp\left(m + \tfrac12 s^2\right), \qquad \mu_2' = \exp\left(2m + 2s^2\right),$$

$$\mu_r' = \exp\left(mr + \tfrac12 s^2 r^2\right), \qquad V(X) = \mu_1'^{\,2}\left(e^{s^2} - 1\right).$$
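The stated variance can be checked against the first two raw moments. The sketch below is a minimal illustration with hypothetical function names; it uses only the formula $\mu_r' = \exp(mr + \frac12 s^2 r^2)$.

```python
from math import exp

def lognormal_raw_moment(r, m, s):
    """r-th raw moment exp(m r + s^2 r^2 / 2) of the log-normal distribution."""
    return exp(m * r + 0.5 * s * s * r * r)

def lognormal_variance(m, s):
    """V(X) = mu_1'^2 (exp(s^2) - 1)."""
    return lognormal_raw_moment(1, m, s) ** 2 * (exp(s * s) - 1.0)

if __name__ == "__main__":
    m, s = 0.3, 0.8
    from_moments = lognormal_raw_moment(2, m, s) - lognormal_raw_moment(1, m, s) ** 2
    print(from_moments, lognormal_variance(m, s))
```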


Ex. 1. A variate X has the density function

$$f(x) = \frac{1}{x\sqrt{4\pi}}\exp\left\{-\tfrac14\left(\log x\right)^2\right\}, \qquad x > 0.$$

Find E(X) and V(X).


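8.15. The Bivariate Normal Distribution. All the continuous random variables discussed so far have been one-dimensional. In statistics, higher dimensional random variables play an important role in describing experimental outcomes. One of the most important continuous two-dimensional distributions is the bivariate normal distribution. It is the direct generalization of the one-dimensional normal distribution. A pair of continuous random variables (X, Y) assuming all values in the Euclidean plane has a bivariate normal distribution if its joint p.d.f. is given by

$$f(x, y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x-\mu_1}{\sigma_1}\right)\left(\frac{y-\mu_2}{\sigma_2}\right) + \left(\frac{y-\mu_2}{\sigma_2}\right)^2\right]\right\},$$

$$-\infty < x < \infty,\ -\infty < y < \infty. \qquad \dots(1)$$

The above p.d.f. depends on five parameters ($\mu_1, \mu_2, \sigma_1, \sigma_2, \rho$). The following restrictions are placed on the parameters:

$$-\infty < \mu_1 < \infty;\ -\infty < \mu_2 < \infty;\ \sigma_1 > 0;\ \sigma_2 > 0;\ -1 < \rho < 1.$$

For f to define a legitimate pdf,

$$f(x, y) \ge 0, \qquad \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1.$$

Obviously, f(x, y) > 0. We show that f(x, y) integrates to 1 over the real plane.

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x-\mu_1}{\sigma_1}\right)\left(\frac{y-\mu_2}{\sigma_2}\right) + \left(\frac{y-\mu_2}{\sigma_2}\right)^2\right]\right\}dx\,dy. \qquad \dots(2)$$

Making the substitution $u = \dfrac{x - \mu_1}{\sigma_1}$, $v = \dfrac{y - \mu_2}{\sigma_2}$, we obtain

$$\frac{1}{2\pi\sqrt{1-\rho^2}}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\left[-\frac{u^2 - 2\rho uv + v^2}{2(1-\rho^2)}\right]du\,dv. \qquad \dots(3)$$

We know that

$$\int_{-\infty}^{\infty}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]dx = \sqrt{2\pi}\,\sigma. \qquad \dots(4)$$

To integrate first with respect to v in (3), we write the integrand as a function of v in the form found in (4). Thus we find that (3) is equal to

$$\frac{1}{2\pi\sqrt{1-\rho^2}}\int_{-\infty}^{\infty}\exp\left(-\frac12\,\frac{u^2}{1-\rho^2} + \frac12\,\frac{\rho^2 u^2}{1-\rho^2}\right)\left\{\int_{-\infty}^{\infty}\exp\left[-\frac12\,\frac{(v - \rho u)^2}{1-\rho^2}\right]dv\right\}du$$

$$= \frac{1}{2\pi\sqrt{1-\rho^2}}\int_{-\infty}^{\infty}\exp\left(-\tfrac12 u^2\right)\sqrt{2\pi}\,\sqrt{1-\rho^2}\;du = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-\frac12 u^2}\,du = 1.$$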
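The normalisation just proved can also be confirmed by brute-force numerical integration. The sketch below is an illustration only: the function names are hypothetical, the midpoint rule on a grid of 300 by 300 cells covering eight standard deviations on each side is an arbitrary numerical choice, and the parameter values are made up for the test.

```python
from math import exp, pi, sqrt

def bivariate_normal_pdf(x, y, mu1, mu2, s1, s2, rho):
    """Joint p.d.f. (1) of the bivariate normal distribution."""
    u = (x - mu1) / s1
    v = (y - mu2) / s2
    q = (u * u - 2.0 * rho * u * v + v * v) / (2.0 * (1.0 - rho * rho))
    return exp(-q) / (2.0 * pi * s1 * s2 * sqrt(1.0 - rho * rho))

def integrate_pdf(mu1, mu2, s1, s2, rho, half_width=8.0, steps=300):
    """Midpoint-rule integral of the p.d.f. over mu_i +/- half_width * sigma_i."""
    hx = 2.0 * half_width * s1 / steps
    hy = 2.0 * half_width * s2 / steps
    total = 0.0
    for i in range(steps):
        x = mu1 - half_width * s1 + (i + 0.5) * hx
        for j in range(steps):
            y = mu2 - half_width * s2 + (j + 0.5) * hy
            total += bivariate_normal_pdf(x, y, mu1, mu2, s1, s2, rho)
    return total * hx * hy

if __name__ == "__main__":
    print(integrate_pdf(1.0, -2.0, 2.0, 0.5, 0.6))  # should be very close to 1
```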


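Properties of the Bivariate normal distribution.

It will now be shown that if (X, Y) has the p.d.f. given by (1), then

(a) the marginal distributions of X and of Y are $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$ respectively;

(b) $\rho$ is the correlation coefficient of X and Y;

(c) the conditional distributions of X (given that Y = y) and of Y (given that X = x) are respectively

$$N\left[\mu_1 + \rho\,(\sigma_1/\sigma_2)(y - \mu_2),\ \sigma_1^2(1-\rho^2)\right], \qquad N\left[\mu_2 + \rho\,(\sigma_2/\sigma_1)(x - \mu_1),\ \sigma_2^2(1-\rho^2)\right].$$

(a) We have

$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\int_{-\infty}^{\infty}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x-\mu_1}{\sigma_1}\right)\left(\frac{y-\mu_2}{\sigma_2}\right) + \left(\frac{y-\mu_2}{\sigma_2}\right)^2\right]\right\}dy.$$

Rearranging and integrating as above we obtain

$$g(x) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\,e^{-\frac12\left(\frac{x-\mu_1}{\sigma_1}\right)^2}, \qquad -\infty < x < \infty. \qquad \dots(5)$$

Thus X is seen to be $N(\mu_1, \sigma_1^2)$.

In like manner, we see that Y is $N(\mu_2, \sigma_2^2)$.

(c) Second, we find the conditional density for y given x. We have

$$h(y \mid x) = \frac{f(x, y)}{g(x)} = \frac{1}{\sqrt{2\pi}\,\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{y-\mu_2}{\sigma_2}\right) - \rho\left(\frac{x-\mu_1}{\sigma_1}\right)\right]^2\right\}$$

$$= \frac{1}{\sqrt{2\pi}\,\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2\sigma_2^2(1-\rho^2)}\left[y - \mu_2 - \rho\,(\sigma_2/\sigma_1)(x - \mu_1)\right]^2\right\}, \qquad \dots(6)$$

which is the normal density function with parameters $\mu_2 + \rho\,(\sigma_2/\sigma_1)(x - \mu_1)$ and $\sigma_2^2(1-\rho^2)$. Thus the conditional distribution for y is normal, is centred at the point $\mu_2 + \rho\,(\sigma_2/\sigma_1)(x - \mu_1)$, and has s.d. $\sigma_2\sqrt{1-\rho^2}$, which is less than the marginal value $\sigma_2$ unless $\rho = 0$. The line $y = \mu_2 + \rho\,(\sigma_2/\sigma_1)(x - \mu_1)$ passes through the centres of these conditional distributions. It is called the regression line of y on x. The graph of $x = E(X \mid y)$ is called the regression curve of X on Y and the graph of $y = E(Y \mid x)$ the regression curve of Y on X. The right hand side of equation (6) is the conditional pdf of y, given X = x. Thus, with a bivariate normal distribution, the conditional mean of Y, given X = x, is linear in x and is given by

$$E(Y \mid x) = \mu_2 + \rho\,(\sigma_2/\sigma_1)(x - \mu_1).$$

(b) Since the coefficient of x in $E(Y \mid x)$ is $\rho\,\sigma_2/\sigma_1$, and since $\sigma_1$ and $\sigma_2$ represent the respective standard deviations, the number $\rho$ is in fact the correlation coefficient of X and Y.

In a similar manner, we can show that the conditional distribution of X, given Y = y, is the normal distribution

$$N\left[\mu_1 + \rho\,\frac{\sigma_1}{\sigma_2}\,(y - \mu_2),\ \sigma_1^2(1-\rho^2)\right].$$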
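The conditional parameters in (c) are simple enough to tabulate. The sketch below is a minimal illustration; the function name and the numerical values are hypothetical.

```python
def conditional_y_given_x(x, mu1, mu2, s1, s2, rho):
    """Mean and variance of Y given X = x for the bivariate normal law:
    mean = mu2 + rho (s2 / s1)(x - mu1), variance = s2^2 (1 - rho^2)."""
    mean = mu2 + rho * (s2 / s1) * (x - mu1)
    variance = s2 * s2 * (1.0 - rho * rho)
    return mean, variance

if __name__ == "__main__":
    # Points on the regression line of y on x, with the reduced variance.
    for x in (-2.0, 0.0, 2.0):
        print(x, conditional_y_given_x(x, 0.0, 1.0, 2.0, 3.0, 0.5))
```

With $\mu_1 = 0$, $\mu_2 = 1$, $\sigma_1 = 2$, $\sigma_2 = 3$, $\rho = 0.5$, the conditional variance is $9 \times 0.75 = 6.75$ for every x, while the conditional mean moves along the regression line.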


Remark. (i) The converse of (a) is not true. It is possible to have a joint pdf which is not bivariate normal and yet the marginal pdfs of X and of Y are one-dimensional normal.

(ii) We observe from Eq. (1) that if $\rho = 0$, the pdf of (X, Y) may be factored and hence X and Y are independent. Thus in the case of a bivariate normal distribution, zero correlation and independence are equivalent.

(iii) We find that both regression functions of the mean are linear. We also observe that the variance of the conditional distribution is reduced in the same proportion as $(1 - \rho^2)$. That is, if $\rho$ is close to zero, the conditional variance is essentially the same as the unconditional variance, while if $\rho$ is close to $\pm 1$, the conditional variance is close to zero.

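Some more properties of the bivariate normal pdf

(i) It is easily seen that the quadratic expression

$$-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x-\mu_1}{\sigma_1}\right)\left(\frac{y-\mu_2}{\sigma_2}\right) + \left(\frac{y-\mu_2}{\sigma_2}\right)^2\right]$$

is always negative or zero, the latter occurring only if $(x, y) = (\mu_1, \mu_2)$.

It follows that the density f(x, y) has its maximum value at $(\mu_1, \mu_2)$. Also as (x, y) moves away from $(\mu_1, \mu_2)$, the density rapidly approaches zero.

(ii) Consider the surface z = f(x, y), where f is the bivariate normal pdf given by Eq. (1). The plane z = const. cuts the surface in an ellipse, that is

$$\left(\frac{x-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x-\mu_1}{\sigma_1}\right)\left(\frac{y-\mu_2}{\sigma_2}\right) + \left(\frac{y-\mu_2}{\sigma_2}\right)^2 = \text{const.}$$

These ellipses are sometimes called contours of constant probability density.

If $\rho = 0$ and $\sigma_1 = \sigma_2$, the above ellipse becomes a circle. As $\rho \to \pm 1$ the above ellipse degenerates into a straight line.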

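(iii) Moment generating function of the bivariate normal distribution.

$$M(t_1, t_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{t_1 x + t_2 y}\,f(x, y)\,dx\,dy = \int_{-\infty}^{\infty} e^{t_1 x}\,g(x)\left[\int_{-\infty}^{\infty} e^{t_2 y}\,h(y \mid x)\,dy\right]dx$$

[Recall $h(y \mid x) = f(x, y)/g(x)$] for all values of $t_1$ and $t_2$. The integral within the bracket is the mgf of the conditional p.d.f. $h(y \mid x)$.

Since $h(y \mid x)$ is $N\left[\mu_2 + \rho\,(\sigma_2/\sigma_1)(x - \mu_1),\ \sigma_2^2(1-\rho^2)\right]$,

$$\int_{-\infty}^{\infty} e^{t_2 y}\,h(y \mid x)\,dy = \exp\left\{t_2\left[\mu_2 + \rho\,(\sigma_2/\sigma_1)(x - \mu_1)\right] + \frac{t_2^2\,\sigma_2^2\,(1-\rho^2)}{2}\right\}.$$

Accordingly, $M(t_1, t_2)$ can be written in the form

$$\exp\left\{t_2\mu_2 - t_2\,\rho\,\frac{\sigma_2}{\sigma_1}\,\mu_1 + \frac{t_2^2\,\sigma_2^2\,(1-\rho^2)}{2}\right\}\int_{-\infty}^{\infty}\exp\left[\left(t_1 + t_2\,\rho\,\frac{\sigma_2}{\sigma_1}\right)x\right]g(x)\,dx.$$

Since $E(e^{tX}) = e^{\mu_1 t + \sigma_1^2 t^2/2}$ for X being a $N(\mu_1, \sigma_1^2)$ variate, we have

$$M(t_1, t_2) = \exp\left\{t_2\mu_2 - t_2\,\rho\,\frac{\sigma_2}{\sigma_1}\,\mu_1 + \frac{t_2^2\,\sigma_2^2\,(1-\rho^2)}{2} + \mu_1\left(t_1 + t_2\,\rho\,\frac{\sigma_2}{\sigma_1}\right) + \frac{\sigma_1^2}{2}\left(t_1 + t_2\,\rho\,\frac{\sigma_2}{\sigma_1}\right)^2\right\}$$

or, equivalently,

$$M(t_1, t_2) = \exp\left\{\mu_1 t_1 + \mu_2 t_2 + \frac{\sigma_1^2 t_1^2 + 2\rho\,\sigma_1\sigma_2\,t_1 t_2 + \sigma_2^2 t_2^2}{2}\right\}.$$

If $\rho = 0$, then

$$M(t_1, t_2) = M(t_1, 0)\,M(0, t_2).$$

Thus X and Y are stochastically independent when $\rho = 0$. However, $\rho = 0$ does not in general imply that two variables are stochastically independent. This can be seen from the joint p.d.f.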

i / * 2 ~yl\ 

* exp l 2TTT7) 2 ) 

in which Cov (X, Y) is zero, and yet X and Y are not independent. 

Exercises

1. For the density function

$f(x, y) = \frac14$, $-1 < x, y < +1$,
= 0, otherwise,

find the probability within the circle $x^2 + y^2 = r^2$. Find the distribution and density functions for $r^2$.

2. For the distribution with density function

$f(x, y) = 6\,e^{-2x - 3y}$, x, y > 0,
= 0, otherwise,

evaluate F(x, y), g(x), h(y), $g(x \mid y)$, $h(y \mid x)$.

3 For the bivariate normal distribution 
f \x, y) = C exp [-S-T (x-7)0>-5) + 4 (y —5) 2 }] 

determine C, the parameters and the regression lines. 

4. Explain the ideas of marginal distributions, conditional 

distributions and the curves of regression, with respect to a 
bivariate distribution. Show how to find them from a given 
bivariate frequency density function/(x, y). 

If dF(x,y)=Ux+3y) e-*-» dxdy, x^O, y* 0 obtain the 

equation of the curve of regression of y on x. 

If dF(x,y)=k exp [~ 2 P ~) U 2 ~ 2 ?xy+y*)] dx dy 

I l 

— oo<x, y<oc, find the distribution of (x 2 —2 ?xy +y 2 ) / ^ 


5 If (X t Y) has the bivariate normal distribution (/*i, P* 
Oi o 2 , ?), show that the random variable 

-’if' Pr-v 1 ) 

has the bivariate normal distribution (0, 0 ; 1,1 ; 0) ; that is, u 
and v are independent and have the standardized normal distri¬ 
bution. 

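6. Show that if one of the regression curves is constant, then

$$\mathrm{Cov}(X, Y) = \rho_{XY} = 0.$$

[Hint. Assume $y = E(Y \mid x) = C$. Then

$$C = \int_{-\infty}^{\infty} y\,h(y \mid x)\,dy = \int_{-\infty}^{\infty} y\,\frac{f(x, y)}{g(x)}\,dy,$$

so that

$$C\,g(x) = \int_{-\infty}^{\infty} y\,f(x, y)\,dy \quad\Rightarrow\quad \int_{-\infty}^{\infty} C\,g(x)\,dx = C = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y\,f(x, y)\,dy\,dx = E(Y),$$

i.e. C = E(Y).

Now

$$\mathrm{Cov}(X, Y) = E\left[\{X - E(X)\}\{Y - E(Y)\}\right] = E(XY) - E(X)\,E(Y)$$

$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,f(x, y)\,dx\,dy - C\,E(X) = \int_{-\infty}^{\infty} x\,g(x)\left[\int_{-\infty}^{\infty} y\,\frac{f(x, y)}{g(x)}\,dy\right]dx - C\,E(X)$$

$$= \int_{-\infty}^{\infty} x\,g(x)\,E(Y \mid x)\,dx - C\,E(X) = 0.]$$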


7. Use Ihe distribution 

»• ■““”™ a.™.. 

* , 5 =rrwssr 5 jrc —• - 

«T'b) exp {~ r (’V 7 )'} *• - « <v< =o 

while the probability of « i self lying within the range do is 

<V exp (~2 "o7“) aJa - 0<a<oo 

Ihl? I s “ C,nS ' i,nt - Show «■»' «"« unconditional (/e. ~ 
,ri utl0n of X has the following probability function 

I I I _ I \ 


2°E" P { —"/"'l' -“<*<* 

s a ,Ks,: IMSI 


°° < .y < cc 


-- .1 i V/I UiV 

/ (v, ,)~L exp J 


fAns. P= = J_ ] 
1 \/2 J 





10. The exponent of a bivariate normal density function is

$$-\frac{2}{3}\left[\frac{(x-10)^2}{4} - \frac{(x-10)(y-15)}{6} + \frac{(y-15)^2}{9}\right].$$

Find out the parameters $\mu_1$, $\mu_2$, $\sigma_1$, $\sigma_2$ and $\rho$, if they exist.

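8.16. The Multivariate Normal Distribution. The p.d.f. for the bivariate normal distribution can be written, on using matrix notation, in the form

$$f(x, y) = \frac{|A|^{1/2}}{2\pi}\exp\left[-\tfrac12\,(X - \mu)\,A\,(X - \mu)'\right],$$

where $X - \mu = [x - \mu_1,\ y - \mu_2]$ is a row vector, $(X - \mu)'$ is the corresponding column vector, and

$$A = \begin{bmatrix} \dfrac{1}{(1-\rho^2)\,\sigma_1^2} & \dfrac{-\rho}{\sigma_1\sigma_2\,(1-\rho^2)} \\[3mm] \dfrac{-\rho}{\sigma_1\sigma_2\,(1-\rho^2)} & \dfrac{1}{(1-\rho^2)\,\sigma_2^2} \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{bmatrix}^{-1} = V^{-1} \text{ (say)},$$

in which $\sigma_1^2 = V(x) = E(x - \mu_1)^2$, $\sigma_2^2 = E(y - \mu_2)^2$, $\sigma_{12} = \sigma_{21} = E\{(x - \mu_1)(y - \mu_2)\} = \mathrm{Cov}(x, y)$. $|A|$ is the determinant of the square matrix A. V is called the covariance matrix. Note that $\rho = \dfrac{\sigma_{12}}{\sigma_1\sigma_2}$.

A k-variate normal density function is given by

$$f(x_1, x_2, \dots, x_k) = \frac{|A|^{1/2}}{(2\pi)^{k/2}}\,e^{-\frac12 (X - \mu)\,A\,(X - \mu)'},$$

where $X - \mu = [x_1 - \mu_1,\ x_2 - \mu_2,\ \dots,\ x_k - \mu_k]$ and $(X - \mu)'$ is the transpose of $X - \mu$.

$A = V^{-1}$, where V is a square matrix of order k whose (i, j)th element is the covariance between the stochastic variables $X_i$ and $X_j$. The diagonal elements of V are the variances of $X_1, X_2, \dots, X_k$ and the non-diagonal elements are the various covariances. $(X - \mu)\,A\,(X - \mu)'$ is assumed to be a positive definite quadratic form, or A is assumed to be positive definite.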
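In the bivariate case the relation $A = V^{-1}$ can be worked by hand or with a few lines of code. The sketch below is an illustration only: the function names and the numerical precision matrix are hypothetical, and the 2 by 2 inverse is written out explicitly rather than taken from a library.

```python
from math import sqrt

def covariance_from_precision(a11, a12, a22):
    """Invert the symmetric 2 x 2 matrix A = [[a11, a12], [a12, a22]].
    Returns (var1, cov, var2), the elements of V = A^{-1}."""
    det = a11 * a22 - a12 * a12
    if a11 <= 0.0 or det <= 0.0:
        raise ValueError("A must be positive definite")
    return a22 / det, -a12 / det, a11 / det

def bivariate_parameters(a11, a12, a22):
    """Standard deviations and correlation coefficient implied by A."""
    var1, cov, var2 = covariance_from_precision(a11, a12, a22)
    s1, s2 = sqrt(var1), sqrt(var2)
    return s1, s2, cov / (s1 * s2)

if __name__ == "__main__":
    # Illustrative matrix: the exponent is -(1/2)[a11 u^2 + 2 a12 u v + a22 v^2].
    print(bivariate_parameters(1.0 / 3.0, -1.0 / 9.0, 4.0 / 27.0))
```

For this illustrative matrix the output corresponds to $\sigma_1 = 2$, $\sigma_2 = 3$ and $\rho = 0.5$.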

8.17. Pearson's Distributions. Karl Pearson considered the differential equation

$$\frac{dy}{dx} = \frac{y\,(m - x)}{a + bx + cx^2} \qquad \dots(\mathrm{I})$$

for obtaining a general system of frequency curves. We get a wide variety of distribution functions as solutions of the differential equations obtained by giving different values to the constants m, a, b, c in (I). We get J-shaped and U-shaped curves, symmetrical and skewed curves, distributions with finite and infinite ranges.



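Multiplying (I) by $x^n$ and integrating over the range $-\infty$ to $\infty$, we have

$$\int_{-\infty}^{\infty}\frac{dy}{dx}\left(ax^n + bx^{n+1} + cx^{n+2}\right)dx = \int_{-\infty}^{\infty} y\left(mx^n - x^{n+1}\right)dx.$$

Integrating the L.H.S. by parts, we get

$$\left[y\left(ax^n + bx^{n+1} + cx^{n+2}\right)\right]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} y\left[an\,x^{n-1} + b(n+1)\,x^{n} + c(n+2)\,x^{n+1}\right]dx = \int_{-\infty}^{\infty} y\left(mx^n - x^{n+1}\right)dx.$$

Assuming that the first term vanishes at either limit,

$$an\,\mu_{n-1}' + b(n+1)\,\mu_n' + c(n+2)\,\mu_{n+1}' = -m\,\mu_n' + \mu_{n+1}'. \qquad \dots(1)$$

If x is the deviation from the mean, we get

$$an\,\mu_{n-1} + b(n+1)\,\mu_n + c(n+2)\,\mu_{n+1} = -m\,\mu_n + \mu_{n+1}. \qquad \dots(2)$$

Putting n = 0, 1, 2, 3 in (2), and using $\mu_0 = 1$, $\mu_1 = 0$, we get

$$b = -m, \qquad a + 3c\,\mu_2 = \mu_2, \qquad 3b\,\mu_2 + 4c\,\mu_3 = -m\,\mu_2 + \mu_3,$$

$$3a\,\mu_2 + 4b\,\mu_3 + 5c\,\mu_4 = -m\,\mu_3 + \mu_4.$$

Solving these, we get

$$a = \frac{\mu_2\,(4\beta_2 - 3\beta_1)}{2\,(5\beta_2 - 6\beta_1 - 9)},$$

$$b = -m = \frac{\sqrt{\mu_2}\,\sqrt{\beta_1}\,(\beta_2 + 3)}{2\,(5\beta_2 - 6\beta_1 - 9)},$$

$$c = \frac{2\beta_2 - 3\beta_1 - 6}{2\,(5\beta_2 - 6\beta_1 - 9)},$$

where $\mu_2 = \sigma^2$, $\beta_1 = \mu_3^2/\mu_2^3$, $\beta_2 = \mu_4/\mu_2^2$, and $\sqrt{\beta_1}$ takes the sign of $\mu_3$.

The solutions of (I) depend on the nature of the roots of the equation $a + bx + cx^2 = 0$. Its discriminant is $b^2 - 4ac$. We define a quantity $\kappa = \dfrac{b^2}{4ac}$ whose values will determine the nature of particular distributions.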
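The formulas for a, b and c can be tried out on distributions whose central moments are known. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the inputs are the central moments $\mu_2$, $\mu_3$, $\mu_4$, and $\sqrt{\mu_2}\sqrt{\beta_1}$ is computed as $\mu_3/\mu_2$ so that it carries the sign of $\mu_3$.

```python
def pearson_coefficients(mu2, mu3, mu4):
    """Constants a, b (= -m) and c of Pearson's differential equation
    dy/dx = y (m - x) / (a + b x + c x^2), with x measured from the mean."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    denom = 2.0 * (5.0 * beta2 - 6.0 * beta1 - 9.0)
    if denom == 0.0:
        raise ValueError("5*beta2 - 6*beta1 - 9 vanishes; the formulas do not apply")
    a = mu2 * (4.0 * beta2 - 3.0 * beta1) / denom
    b = (mu3 / mu2) * (beta2 + 3.0) / denom  # sqrt(mu2) * sqrt(beta1) with the sign of mu3
    c = (2.0 * beta2 - 3.0 * beta1 - 6.0) / denom
    return a, b, c

if __name__ == "__main__":
    # Normal distribution with variance 4: mu3 = 0, mu4 = 3 * 4^2 = 48.
    print(pearson_coefficients(4.0, 0.0, 48.0))
    # Gamma distribution with parameter r = 3: mu2 = 3, mu3 = 6, mu4 = 45.
    print(pearson_coefficients(3.0, 6.0, 45.0))
```

For the normal case the constants reduce to $a = \mu_2$, $b = c = 0$, so that the equation becomes $dy/dx = -xy/\mu_2$; for the Gamma case c vanishes, which is the transitional case in which the quadratic $a + bx + cx^2$ degenerates to a linear factor.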
Remark: 

Putting n o in (1) we get 

b-\-2cni= — m+ Mi' 

, /« + A 

or '*■ - r=w 

Putting n=l in (I) we get 

<2-f 26/V + 3r/i 2 '= —m/ij'-f 

, aT(wi4-2/>)/*,' 

or ,< a =-- 


or fi, = 






If now $t = \dfrac{x - \mu'_1}{\sigma}$, then for the new standardized t

$$\mu'_1 = 0, \quad \mu'_2 = \mu_2 = 1 \ (\text{or } \sigma = 1), \quad E(t^n) = \mu'_n = \mu_n.$$

Hence $b = -m$, $a = 1 - 3c$ and equation (1) becomes

$$(1-3c)\,n\mu_{n-1} - mn\mu_n + \{c(n+2) - 1\}\,\mu_{n+1} = 0.$$

Giving n the values 2 and 3, we get

$$2m + (1-4c)\,\mu_3 = 0,$$
$$3(1-3c) - 3m\mu_3 - (1-5c)\,\mu_4 = 0.$$

Whence

$$\gamma_1 = \mu_3 = \frac{2m}{4c-1}, \qquad \gamma_2 = \mu_4 - 3 = \frac{6\,(m^2 - 4c^2 + c)}{(4c-1)(5c-1)}.$$

These relations give the skewness and excess of kurtosis of any
of the Pearson curves for which the fourth moment exists. These
relations may be expressed in a more convenient form:

$$c = \frac{\delta}{2(1+2\delta)}, \quad \text{where } \delta = \frac{2\gamma_2 - 3\gamma_1^2}{\gamma_2 + 6}.$$

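Type I. $\dfrac{b^2}{4ac} < 0$, $c_1 < x < c_2$,

where $c_1$ and $c_2$ are the roots of $a + bx + cx^2 = 0$, that is,
$c_1c_2 = a/c = a^2/(ac) < 0$. Thus the roots are real and of opposite signs.

The differential equation is

$$\frac{1}{y}\frac{dy}{dx} = \frac{x-m}{c\,(x-c_1)(c_2-x)} = \frac{m_1}{x-c_1} - \frac{m_2}{c_2-x},$$

where $m_1 = \dfrac{m-c_1}{c\,(c_1-c_2)}$, $m_2 = \dfrac{c_2-m}{c\,(c_1-c_2)}$;

both $m_1, m_2 > 0$, since $c < 0$.

Integrating, $y = A\,(x-c_1)^{m_1}(c_2-x)^{m_2}$.

Change the variable x to u by $u = \dfrac{x-c_1}{c_2-c_1}$, so that $1-u = \dfrac{c_2-x}{c_2-c_1}$.

Hence $y = B\,u^{m_1}(1-u)^{m_2}$, $0 < u < 1$.

Therefore u is a $\beta_1(m_1+1,\ m_2+1)$ variate.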

Type VI. $\dfrac{b^2}{4ac} > 1$, $c_1 < x < \infty$,

where $c_1$ is the larger root of $a + bx + cx^2 = 0$.

These conditions imply that the roots are real and have the
same sign.

The differential equation is

$$\frac{1}{y}\frac{dy}{dx} = \frac{m-x}{c\,(x-c_1)(x-c_2)} = \frac{m_1}{x-c_1} - \frac{m_2}{x-c_2},$$

where $c_2 < c_1 < x$,

$$m_1 = \frac{m-c_1}{c\,(c_1-c_2)}, \qquad m_2 = \frac{m-c_2}{c\,(c_1-c_2)}.$$

Both $m_1, m_2 > 0$, since $c > 0$. Hence

$$y = A\,(x-c_1)^{m_1}(x-c_2)^{-m_2}.$$

Putting $u = \dfrac{x-c_1}{c_1-c_2}$, $1+u = \dfrac{x-c_2}{c_1-c_2}$, we get

$$y = B\,u^{m_1}(1+u)^{-m_2}, \quad 0 < u < \infty,$$

so that u is a $\beta_2(m_1+1,\ m_2-m_1-1)$ variate.

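Type IV. $0 < \dfrac{b^2}{4ac} < 1$, $-\infty < x < \infty$.

In this case the roots are imaginary.

We have $\dfrac{1}{y}\dfrac{dy}{dx} = \dfrac{m-x}{a+bx+cx^2}$.

The mode is evidently at x = m.

Transfer the origin to the mode (x = X + m) and get

$$\frac{d}{dX}(\log y) = \frac{-X}{a + b(X+m) + c(X+m)^2} = \frac{-X}{B_0 + B_1X + B_2X^2}$$

$$= \frac{-X}{B_2\left\{\left(X + \dfrac{B_1}{2B_2}\right)^2 + \dfrac{B_0}{B_2} - \dfrac{B_1^2}{4B_2^2}\right\}} = -\frac{(X+\alpha) - \alpha}{B_2\{(X+\alpha)^2 + \beta^2\}}, \text{ say}.$$

Integrating,

$$\log y = \log k - \frac{1}{2B_2}\log\{(X+\alpha)^2 + \beta^2\} + \frac{\alpha}{B_2\beta}\tan^{-1}\frac{X+\alpha}{\beta}$$

or

$$y = y_0\,[(X+\alpha)^2 + \beta^2]^{-1/(2B_2)}\,\exp\left[\frac{\alpha}{B_2\beta}\tan^{-1}\frac{X+\alpha}{\beta}\right].$$

Writing x for $X + \alpha$ and $\nu = -\alpha/(B_2\beta)$, this can be put in the form

$$y = y_0\left(1 + \frac{x^2}{\beta^2}\right)^{-1/(2B_2)} e^{-\nu\tan^{-1}(x/\beta)}, \quad -\infty < x < \infty.$$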


Type III. $\dfrac{b^2}{4ac} = \infty$, $c = 0$, $c_1 \le x < \infty$,

where $c_1 = -a/b$. These restrictions imply that one root of
$a + bx + cx^2 = 0$ is infinite.

The differential equation is

$$\frac{1}{y}\frac{dy}{dx} = \frac{m-x}{b\,(x-c_1)} = -\frac{1}{b} + \frac{m-c_1}{b\,(x-c_1)} = -m_1 + \frac{m_2}{x-c_1}, \text{ say}.$$

Hence

$$y = C\,e^{-m_1x}(x-c_1)^{m_2}.$$

Putting $u = m_1(x-c_1)$, we get

$$y = B\,u^{m_2}e^{-u}, \quad 0 < u < \infty,$$

where $B = \dfrac{1}{\Gamma(m_2+1)}$.

Thus u is a $\gamma(m_2+1)$ variate.

Note that for a Type III curve $c = 0 \Rightarrow \dfrac{\delta}{2(1+2\delta)} = 0 \Rightarrow \delta = 0$.

Therefore for this curve $\gamma_1$ and $\gamma_2$ are connected by the relation

$$2\gamma_2 = 3\gamma_1^2$$

or $2(\beta_2 - 3) = 3\beta_1$

or $2\beta_2 = 3(\beta_1 + 2)$.

Type VII. $\dfrac{b^2}{4ac} = 0$, $c > 0$, $-\infty < x < \infty$.

Because of these restrictions the differential equation is

$$\frac{1}{y}\frac{dy}{dx} = \frac{m-x}{c\,(x^2+d^2)}, \quad -\infty < x < \infty,$$

where $d^2 = a/c$.

If the variable is standardized,

$m = b = 0$ and $d^2 = \dfrac{1}{c} - 3$ [recall that for a standardized variable
$b = -m$, $a = 1 - 3c$].

We obtain on integration, calling the standardized variable u,

$$\log y = -\frac{1}{2c}\log(u^2+d^2) + A$$

or

$$y = B\,(u^2+d^2)^{-(d^2+3)/2}.$$

This is a symmetrical curve, asymptotic to the x axis in both
directions.

Putting $u = d\,t\,(d^2+2)^{-1/2}$, $n = d^2+2$, we find that the distribution
of t is given by

$$f(t) \propto \left(1 + \frac{t^2}{n}\right)^{-(n+1)/2}, \quad -\infty < t < \infty,$$

which is Student's t-distribution.





Remark. The normal distribution may be obtained as a
solution of the differential equation for b = c = 0 and a > 0. For

$$\frac{1}{y}\frac{dy}{dx} = \frac{m-x}{a}, \quad -\infty < x < \infty.$$

Integrating,

$$\log y = -\frac{1}{2a}(m-x)^2 + \text{const.}$$

or

$$y = C\,e^{-(x-m)^2/(2a)}, \quad -\infty < x < \infty,$$

where $C = \dfrac{1}{\sqrt{2\pi a}}$.

Thus x is a normal variate.

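Type V. Roots of $a + bx + cx^2 = 0$ are equal so that $b^2/(4ac) = 1$.

Here we take $\dfrac{b^2}{4ac} = 1$, $c_1 < x < \infty$, where $c_1 = -\dfrac{b}{2c}$.

The differential equation is

$$\frac{1}{y}\frac{dy}{dx} = \frac{m-x}{c\,\{x + b/(2c)\}^2} = \frac{m-x}{c\,(x-c_1)^2}$$

or

$$\frac{1}{y}\frac{dy}{dx} = \frac{m_1}{(x-c_1)^2} - \frac{m_2}{x-c_1},$$

where $m_1 = \dfrac{m-c_1}{c}$, $m_2 = \dfrac{1}{c}$.

Integration gives $y = A\,e^{-m_1/(x-c_1)}(x-c_1)^{-m_2}$.

Putting $u = x - c_1$ we get

$$y = B\,u^{-m_2}e^{-m_1/u}, \quad 0 < u < \infty, \quad m_1, m_2 > 0,$$

where $B = \dfrac{m_1^{\,m_2-1}}{\Gamma(m_2-1)}$.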

Type II. Roots of the equation $a + bx + cx^2 = 0$ are equal in magnitude but
of opposite signs so that $b^2/(4ac) = 0$.

Transferring the origin to the mode, the differential equation is

$$\frac{d}{dX}(\log y) = \frac{-X}{B_0 + B_1X + B_2X^2} \quad \text{[see Type IV]}.$$

The above condition implies that $B_1 = 0$ and $B_2$ and $B_0$ are of
opposite signs.

Hence

$$\frac{d}{dX}(\log y) = \frac{-X}{B_0 + B_2X^2},$$

$$\log y = -\frac{1}{2B_2}\log(B_0 + B_2X^2) + \text{const.}$$

This can be put in the form

$$y = y_0\left(1 - \frac{x^2}{p^2}\right)^{q}, \quad -p \le x \le p,$$

where $p = \sqrt{-B_0/B_2}$.

Cor. If q = 0, then y = const. and the distribution is
rectangular.

Ex. Derive normal and rectangular distributions as particular
cases of Pearson's system of frequency curves.

[M.A. (Delhi) 54, 59]

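8.18. The Gram-Charlier series. There is another general system
of distribution functions, known as the Gram-Charlier series. It is
based upon the normal distribution and its derivatives. This system
is composed of an infinite series of a certain kind of which the
first term corresponds to the normal law. We assume that the variable
has been standardized, that is, we consider $t = (x - \mu'_1)/\sigma$.

Repeated differentiation of the function $e^{-t^2/2}$ yields

$$\frac{d}{dt}\left(e^{-t^2/2}\right) = -t\,e^{-t^2/2}$$
$$\frac{d^2}{dt^2}\left(e^{-t^2/2}\right) = (t^2 - 1)\,e^{-t^2/2}$$
$$\frac{d^3}{dt^3}\left(e^{-t^2/2}\right) = -(t^3 - 3t)\,e^{-t^2/2}$$
$$\cdots$$
$$\frac{d^n}{dt^n}\left(e^{-t^2/2}\right) = (-1)^n H_n(t)\,e^{-t^2/2}$$

where $H_n(t)$ is the nth Hermite polynomial

$$H_n(t) = t^n - \frac{n(n-1)}{2}\,t^{n-2} + \frac{n(n-1)(n-2)(n-3)}{2\cdot 4}\,t^{n-4} - \cdots$$

By repeated integration by parts it can be easily shown that

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} H_m(t)\,H_n(t)\,e^{-t^2/2}\,dt = \begin{cases} n! & \text{if } m = n \\ 0 & \text{if } m \ne n \end{cases} \qquad \ldots(1)$$

Let us assume that a given frequency function can be expanded
in a series

$$f(t) = a_0\,\phi(t) + a_1\,\phi'(t) + a_2\,\phi''(t) + \cdots + a_n\,\phi^{(n)}(t) + \cdots \qquad \ldots(2)$$

where $\phi(t) = \dfrac{1}{\sqrt{2\pi}}\,e^{-t^2/2}$.

The constants in the series (2) can be formally obtained by
means of (1). Multiplying (2) by $H_n(t)$ and integrating term
by term we have

$$\int_{-\infty}^{\infty} f(t)\,H_n(t)\,dt = a_n\int_{-\infty}^{\infty} \phi^{(n)}(t)\,H_n(t)\,dt = (-1)^n\,n!\,a_n,$$

since all terms in the sum except that for which m = n are
zero on integration. Substituting $H_0 = 1$, $H_1 = t$, $H_2 = t^2 - 1$,
$H_3 = t^3 - 3t$, $H_4 = t^4 - 6t^2 + 3$, we obtain

$$a_0 = \int_{-\infty}^{\infty} f(t)\,dt = 1$$

$$a_1 = -\int_{-\infty}^{\infty} t\,f(t)\,dt = 0, \text{ since } \mu'_1(t) = 0$$

$$a_2 = \frac{1}{2}\int_{-\infty}^{\infty} (t^2 - 1)\,f(t)\,dt = 0, \text{ since } \mu'_2(t) = \mu_2(t) = 1$$

$$a_3 = -\frac{1}{6}\int_{-\infty}^{\infty} (t^3 - 3t)\,f(t)\,dt = -\frac{\mu_3}{6} = -\frac{\gamma_1}{6}, \text{ since } \mu'_3(t) = \mu_3(t)$$

$$a_4 = \frac{1}{24}\int_{-\infty}^{\infty} (t^4 - 6t^2 + 3)\,f(t)\,dt = \frac{\mu_4 - 3}{24} = \frac{\gamma_2}{24}.$$

Therefore

$$f(t) = \phi(t) - \frac{\gamma_1}{6}\,\phi^{(3)}(t) + \frac{\gamma_2}{24}\,\phi^{(4)}(t) + \cdots$$

This is the Gram-Charlier A series.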
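As an illustration of the series truncated after the $\gamma_2$ term, the sketch below evaluates $f(t) = \phi(t)\,[1 + (\gamma_1/6)H_3(t) + (\gamma_2/24)H_4(t)]$, which follows from $\phi^{(3)} = -H_3\phi$ and $\phi^{(4)} = H_4\phi$. The function names are hypothetical and the values of $\gamma_1$, $\gamma_2$ in the demonstration are made up for illustration; note that a truncated series of this kind need not be non-negative everywhere.

```python
import math


def phi(t):
    """Standard normal density."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)


def gram_charlier_a(t, gamma1, gamma2):
    """Gram-Charlier A series truncated after the gamma2 term,
    for a standardized variable t."""
    h3 = t ** 3 - 3 * t                # H_3(t)
    h4 = t ** 4 - 6 * t ** 2 + 3       # H_4(t)
    return phi(t) * (1 + gamma1 / 6 * h3 + gamma2 / 24 * h4)


if __name__ == "__main__":
    # Illustrative (made-up) skewness and excess values.
    for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(t, gram_charlier_a(t, 0.3, 0.5))
```

Because $H_3$ and $H_4$ are orthogonal to $H_0$ under the normal weight, the truncated series still integrates to one, and its third moment equals $\gamma_1$.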

8.19. Hermite Polynomials.

Let $\phi(x) = \dfrac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$.

Then

$$\frac{d}{dx}\phi(x) = -x\,\phi(x)$$
$$\frac{d^2}{dx^2}\phi(x) = (x^2 - 1)\,\phi(x)$$
$$\frac{d^3}{dx^3}\phi(x) = (3x - x^3)\,\phi(x) = -(x^3 - 3x)\,\phi(x)$$
$$\cdots$$
$$\frac{d^n}{dx^n}\phi(x) = (-1)^n H_n(x)\,\phi(x)$$

where $H_n(x)$ is a polynomial in x of degree n, called the
Hermite polynomial.

We define therefore the Hermite polynomial by the equation

$$(-1)^r\,\frac{d^r}{dx^r}\phi(x) = H_r(x)\,\phi(x)$$

or $(-D)^r\phi(x) = H_r(x)\,\phi(x)$.

Evidently $H_r(x)$ is of degree r in x and the coefficient of $x^r$ is
unity.

By convention $H_0 = 1$. We have

$$\phi(x-t) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x-t)^2} = \phi(x)\,e^{tx - t^2/2}$$

and also, by Taylor's theorem,

$$\phi(x-t) = \phi(x) - t\,D\phi(x) + \frac{t^2}{2!}D^2\phi(x) - \cdots + \frac{(-t)^r}{r!}D^r\phi(x) + \cdots$$

or

$$e^{tx - t^2/2}\,\phi(x) = \sum_{j=0}^{\infty}\frac{t^j}{j!}\,H_j(x)\,\phi(x).$$

Consequently $H_r(x)$ is the coefficient of $\dfrac{t^r}{r!}$ in $e^{tx - t^2/2}$. It
follows that

$$H_r(x) = x^r - \frac{r^{[2]}}{2\cdot 1!}\,x^{r-2} + \frac{r^{[4]}}{2^2\cdot 2!}\,x^{r-4} - \frac{r^{[6]}}{2^3\cdot 3!}\,x^{r-6} + \cdots$$

where $r^{[2]} = r(r-1)$, $r^{[4]} = r(r-1)(r-2)(r-3)$, $r^{[6]} = r(r-1)(r-2)\cdots(r-5)$.
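Some Properties.

(a) The second order differential equation satisfied by $H_r(x)$.
Differentiating the identity

$$e^{tx - t^2/2} = \sum_{j=0}^{\infty}\frac{t^j}{j!}\,H_j(x) \qquad \ldots(1)$$

with respect to x we have

$$t\,e^{tx - t^2/2} = \sum_{j=0}^{\infty}\frac{t^j}{j!}\,H_j'(x)$$

or

$$\sum_{j=0}^{\infty}\frac{t^{j+1}}{j!}\,H_j(x) = \sum_{j=0}^{\infty}\frac{t^j}{j!}\,H_j'(x).$$

Whence equating coefficients of $t^r$ we have

$$\frac{H_{r-1}(x)}{(r-1)!} = \frac{H_r'(x)}{r!}$$

or

$$\frac{d}{dx}H_r(x) = r\,H_{r-1}(x).$$

Differentiating this once more

$$\frac{d^2}{dx^2}H_r(x) = r\,\frac{d}{dx}H_{r-1}(x) = r(r-1)\,H_{r-2}(x)$$

or $D^2H_r(x) = r^{[2]}H_{r-2}(x)$; $D = d/dx$, $r^{[2]} = r(r-1)$,

and generally

$$D^sH_r(x) = r^{[s]}\,H_{r-s}(x). \qquad \ldots(A)$$

Differentiating (1) with respect to t we have

$$(x-t)\,e^{tx - t^2/2} = \sum_{j=1}^{\infty}\frac{t^{j-1}}{(j-1)!}\,H_j(x)$$

or

$$(x-t)\sum_{j=0}^{\infty}\frac{t^j}{j!}\,H_j(x) = \sum_{j=1}^{\infty}\frac{t^{j-1}}{(j-1)!}\,H_j(x).$$

Identifying coefficients of $t^{r-1}$ we get

$$\frac{x\,H_{r-1}(x)}{(r-1)!} - \frac{H_{r-2}(x)}{(r-2)!} = \frac{H_r(x)}{(r-1)!}$$

or

$$H_r(x) - x\,H_{r-1}(x) + (r-1)\,H_{r-2}(x) = 0.$$

This together with (A) gives

$$H_r(x) - \frac{x}{r}\frac{d}{dx}H_r(x) + \frac{1}{r}\frac{d^2}{dx^2}H_r(x) = 0 \ \Rightarrow\ y'' - xy' + ry = 0 \ \text{ if } H_r(x) = y.$$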

(b) The Hermite polynomials possess an orthogonal property.
Consider

$$\int_{-\infty}^{\infty} H_m(x)\,H_n(x)\,\phi(x)\,dx = \int_{-\infty}^{\infty} H_m(x)\,(-1)^n D^n\phi(x)\,dx$$

$$= (-1)^n\Big[H_m(x)\,D^{n-1}\phi(x)\Big]_{-\infty}^{\infty} + (-1)^{n-1}\int_{-\infty}^{\infty}\frac{d}{dx}H_m(x)\;D^{n-1}\phi(x)\,dx$$

$$= m\,(-1)^{n-1}\int_{-\infty}^{\infty} H_{m-1}(x)\,D^{n-1}\phi(x)\,dx,$$

since the term in square brackets vanishes.

Continuing this process, we find either zero, if $m \ne n$, or m!
if m = n.
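The recurrence $H_r(x) = x\,H_{r-1}(x) - (r-1)\,H_{r-2}(x)$ and the orthogonal property can be checked exactly with integer arithmetic. The sketch below is illustrative only; the helper names are hypothetical. It stores each polynomial as a list of coefficients (lowest degree first) and evaluates $\int H_m H_n \phi\,dx$ through the known moments of the standard normal variate, $E(x^k) = (k-1)(k-3)\cdots 1$ for even k and 0 for odd k.

```python
from math import factorial


def hermite(r):
    """Coefficients (lowest degree first) of H_r, built from
    H_r = x H_{r-1} - (r-1) H_{r-2}, with H_0 = 1 and H_1 = x."""
    h_prev, h = [1], [0, 1]
    if r == 0:
        return h_prev
    for k in range(2, r + 1):
        x_h = [0] + h                                   # x * H_{k-1}
        scaled = [(k - 1) * c for c in h_prev]          # (k-1) * H_{k-2}
        scaled += [0] * (len(x_h) - len(scaled))
        h_prev, h = h, [p - q for p, q in zip(x_h, scaled)]
    return h


def normal_moment(k):
    """E[x^k] for a standard normal variate."""
    if k % 2:
        return 0
    result = 1
    for j in range(k - 1, 0, -2):
        result *= j
    return result


def inner(m, n):
    """Integral of H_m(x) H_n(x) phi(x) dx, computed term by term."""
    hm, hn = hermite(m), hermite(n)
    return sum(a * b * normal_moment(i + j)
               for i, a in enumerate(hm) for j, b in enumerate(hn))


if __name__ == "__main__":
    for r in range(5):
        print(r, hermite(r), inner(r, r) == factorial(r))
```

Working with exact integers avoids any numerical integration error, so the equality with m! can be tested directly.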


Exercises 

1. Show that, when the constants are suitably selected, the
solution of ^ ^ leads to the $\chi^2$ distribution with 2k

degrees of freedom where $c^2k = c(a+c) - b$.

[M.A. (Stat.) Delhi 65, I.C A.R. Delhi 5 




t 


Hint. *_■'*+"=* * 

y c c- x + b/c 


y=A e*l* 


( -4) ( “ 


- *)/c 2 


Changing the constants,

$$y = A\,e^{-\beta x}(x+\alpha)^{k-1}, \quad \beta > 0,\ k > 0,$$

where $c^2k = c(a+c) - b$; the range being $-\alpha < x < \infty$

and

$$A = \frac{\beta^k\,e^{-\alpha\beta}}{\Gamma(k)}.$$

Hence when $\alpha = 0$ and $\beta = \tfrac{1}{2}$, this distribution is known as the $\chi^2$ distribution
with 2k d.f.

2. (a) Show that for a Pearson distribution

$$\frac{df(x)}{dx} = \frac{(x+a)\,f(x)}{b_0 + b_1x + b_2x^2}$$

the characteristic function $\phi$ obeys the relation

$$b_2\theta\,\frac{d^2\phi}{d\theta^2} + (1 + 2b_2 + b_1\theta)\,\frac{d\phi}{d\theta} + (a + b_1 + b_0\theta)\,\phi = 0$$

where $\theta = it$.

(b) Hence deduce the recurrence relation for moments.

[M.A. Madras 1950]

[(b) Hint. Differentiate the given relation n times with respect
to $\theta$ and then put $\theta = 0$. Remember $\left[\dfrac{d^n\phi}{d\theta^n}\right]_{\theta=0} = \mu'_n$. Transferring
the origin to the mean, we get

$$[(n+2)b_2 + 1]\,\mu_{n+1} + [(n+1)b_1 + a]\,\mu_n + nb_0\,\mu_{n-1} = 0]$$

3. Obtain the Pearson Type III curve from the basic differential
equation of the Pearsonian system of curves and prove that for
this curve $2\beta_2 = 3(\beta_1 + 2)$. [M.A. Bombay 58, M.A. (Stat.) Delhi 69]

4. If $H_r(x)$ is a Hermite polynomial defined by

$$(-D)^r e^{-x^2/2} = H_r(x)\,e^{-x^2/2}, \quad D = \frac{d}{dx},$$

prove that

(i) $H_r(x) = x^r - \dfrac{r^{[2]}}{2\,(1!)}\,x^{r-2} + \dfrac{r^{[4]}}{2^2\,(2!)}\,x^{r-4} - \dfrac{r^{[6]}}{2^3\,(3!)}\,x^{r-6} + \cdots$

(ii) $D^2H_r(x) - x\,DH_r(x) + r\,H_r(x) = 0$ [M.A. Bombay 54]

(iii) $\displaystyle\int_{-\infty}^{x} H_r(t)\,e^{-t^2/2}\,dt = -H_{r-1}(x)\,e^{-x^2/2}$

[M.A. (Stat.) Delhi 65]

5. Discuss the differential equation for the Pearsonian
system of curves and find for which values of the constants we get
the following $\chi^2$ distribution having n d.f.

$$dF = \text{const.}\; e^{-\chi^2/2}\,(\chi^2)^{\frac{n}{2}-1}\,d(\chi^2), \quad 0 < \chi^2 < \infty$$

[M.A. Patna, 1957]


9 

REGRESSION AND CORRELATION; 

CURVE FITTING 

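9.1. In this section we shall discuss problems of dependence,
or correlation as it is often called, in the case of a two dimensional
random variable.

We note the moments of a bivariate distribution.

$$\mu'_{rs} = \frac{1}{N}\sum_i f_i\,x_i^r\,y_i^s, \qquad N = \sum_i f_i.$$

In particular

$$\mu'_{10} = \frac{1}{N}\sum_i f_i\,x_i = \bar{x}, \qquad \mu'_{01} = \frac{1}{N}\sum_i f_i\,y_i = \bar{y},$$

where $\bar{x}$ is the mean value of x in the distribution, and $\bar{y}$ the
mean value of y. Similarly

$$\mu'_{20} = \frac{1}{N}\sum_i f_i\,x_i^2 = \frac{1}{N}\sum_i f_i\,(x_i - \bar{x})^2 + \bar{x}^2 = \sigma_x^2 + \bar{x}^2,$$

$$\mu'_{02} = \frac{1}{N}\sum_i f_i\,y_i^2 = \sigma_y^2 + \bar{y}^2,$$

where $\sigma_x^2$ is the variance of the variable x in the distribution,
and likewise $\sigma_y^2$ is the variance of y.

Lastly $\mu'_{11} = \dfrac{1}{N}\sum_i f_i\,x_i\,y_i$ [note $\mu'_{11} = E(XY)$]

and $\mu_{11} = \dfrac{1}{N}\sum_i f_i\,(x_i - \bar{x})(y_i - \bar{y})$ [note $\mu_{11} = E\{(X - \bar{x})(Y - \bar{y})\}$].

$\mu_{11}$ is called the covariance of the variables. Some authors use the
symbol $\sigma_{xy}$ in place of $\mu_{11}$.

We can alternatively write $\mu_{11}$ as

$$\mu_{11} = \frac{1}{N}\Big[\sum_i f_i\,x_i\,y_i - \bar{x}\sum_i f_i\,y_i - \bar{y}\sum_i f_i\,x_i + N\bar{x}\bar{y}\Big] = \mu'_{11} - \bar{x}\bar{y}$$

or $\text{Cov}(X, Y) = E(XY) - E(X)\,E(Y)$ ...(1)

In fact $\text{Cov}(X, Y) = E\{(X - \bar{X})(Y - \bar{Y})\}$.

The formula (1) is very important, and will be used frequently.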
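The equality of the two expressions for the covariance can be checked numerically. The following is a small sketch, not taken from the text: the function name `bivariate_moments` and the data in the demonstration are made up for illustration.

```python
def bivariate_moments(xs, ys, fs):
    """Means, raw moment mu'_11 and central moment mu_11 of a
    bivariate frequency distribution with pairs (x_i, y_i) and
    frequencies f_i."""
    n = sum(fs)
    mean_x = sum(f * x for x, f in zip(xs, fs)) / n
    mean_y = sum(f * y for y, f in zip(ys, fs)) / n
    raw_11 = sum(f * x * y for x, y, f in zip(xs, ys, fs)) / n
    central_11 = sum(f * (x - mean_x) * (y - mean_y)
                     for x, y, f in zip(xs, ys, fs)) / n
    return mean_x, mean_y, raw_11, central_11


if __name__ == "__main__":
    # Hypothetical data: four distinct pairs with their frequencies.
    mx, my, raw, central = bivariate_moments([1, 2, 3, 4], [2, 4, 5, 9], [1, 2, 2, 1])
    print(central, raw - mx * my)   # the two forms of the covariance
```

In hand computation the second form, $\mu'_{11} - \bar{x}\bar{y}$, is usually the more convenient one because it avoids forming the deviations.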

9.2. Lines of regression

Suppose we have a record of N marriages and the ages of the
couples are indicated by the variables x and y. Then to each
couple corresponds a pair of values $(x_i, y_i)$ of the variables. Each
pair $(x_i, y_i)$ may be represented by a point $P_i$ in the x, y plane.
Such a graphical representation is called a scatter diagram.
Possibly the pairs of values may not be all different. Suppose the
pair $(x_i, y_i)$ occurs with frequency $f_i$. Then

$$\sum_{i=1}^{n} f_i = N,$$

n being the number of different pairs of values. The assemblage
of pairs of values, together with their frequencies, constitutes a
bivariate frequency distribution. To consider a more general
case, let us suppose that in a set of data the value $x_i$ is associated with
$y_j$ with frequency $f_{ij}$, where i = 1, 2, 3,..., n, j = 1, 2, 3,..., m and

$$\sum_{i} f_{ij} = m_j, \qquad \sum_{j} f_{ij} = n_i, \qquad \sum_{i}\sum_{j} f_{ij} = N.$$

In this case the bivariate frequency table can be shown as below.

    y \ x      | x_1    x_2    x_3   ...  x_i   ...  x_n   | Total of rows
    -----------+-------------------------------------------+--------------
    y_1        | f_11   f_21   f_31  ...  f_i1  ...  f_n1  | m_1
    y_2        | f_12   f_22   f_32  ...  f_i2  ...  f_n2  | m_2
    y_3        | f_13   f_23   f_33  ...  f_i3  ...  f_n3  | m_3
    ...        |                                           |
    y_j        | f_1j   f_2j   f_3j  ...  f_ij  ...  f_nj  | m_j
    ...        |                                           |
    y_m        | f_1m   f_2m   f_3m  ...  f_im  ...  f_nm  | m_m
    -----------+-------------------------------------------+--------------
    Total of   | n_1    n_2    n_3   ...  n_i   ...  n_n   | N
    columns    |



It frequently happens that the scatter diagram indicates an
association between the variables x and y, for the distribution of
dots may be denser in the neighbourhood of a certain curve which
may be called a curve of regression. The equation of such a curve
indicates a functional relationship between the variables x and y.
For the present we determine two straight lines, one of which
gives the closest estimate a straight line can give to the average
value of y for each specified value of x, while the other gives the
corresponding estimate of x for a given value of y. In other words
we have to fit two straight lines to the data

$$x_i, \quad \bar{y}_i = \frac{1}{n_i}\sum_j f_{ij}\,y_j \qquad (i = 1, 2, \ldots, n)$$

and

$$\bar{x}_j = \frac{1}{m_j}\sum_i f_{ij}\,x_i, \quad y_j \qquad (j = 1, 2, \ldots, m).$$

These are called the lines of regression of y on x, and of x on y
respectively.

Let us first consider the line of regression of y on x. In this
case we have to determine the constants a and b so that the
equation

$$y = a + bx \qquad \ldots(2)$$

for a given bivariate data gives, for each value of x, the best
estimate a linear equation can give for the average value of y. We
interpret the term 'best estimate' in accordance with the principle
of least squares. According to the principle of least squares we
have to find a and b such that the sum of squares of the deviations,
that is, $\sum_{i=1}^{n} f_i\,(y_i - a - bx_i)^2$, is minimum. The point $P_i$
represents the pair of values $(x_i, y_i)$ and $H_i$ the pair of values
$(x_i, a + bx_i)$ with the same abscissa $x_i$. The deviation of $P_i$ from
its estimate $H_i$ is $H_iP_i$ and $f_i$ is the frequency of the pair of values
$(x_i, y_i)$. Hence let


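$$U = \sum_{i=1}^{n} f_i\,(y_i - a - bx_i)^2.$$

For U to be minimum, $\dfrac{\partial U}{\partial a} = 0$ and $\dfrac{\partial U}{\partial b} = 0$.

These lead to the normal equations

$$\sum_i f_i\,(y_i - a - bx_i) = 0 \qquad \ldots(3)$$
$$\sum_i f_i\,x_i\,(y_i - a - bx_i) = 0 \qquad \ldots(4)$$

These are equivalent to

$$\frac{1}{N}\sum_i f_i\,y_i - a\,\frac{1}{N}\sum_i f_i - b\,\frac{1}{N}\sum_i f_i\,x_i = 0 \quad \text{or} \quad \bar{y} - a - b\bar{x} = 0 \qquad \ldots(5)$$

and

$$\frac{1}{N}\sum_i f_i\,x_i\,y_i - a\,\frac{1}{N}\sum_i f_i\,x_i - b\,\frac{1}{N}\sum_i f_i\,x_i^2 = 0 \quad \text{or} \quad \mu'_{11} - a\bar{x} - b\mu'_{20} = 0.$$

Now since $\mu'_{11} = \mu_{11} + \bar{x}\bar{y}$, $\mu'_{20} = \sigma_x^2 + \bar{x}^2$ (see section 9.1), this
can be written in the form

$$\mu_{11} + \bar{x}\bar{y} - a\bar{x} - b\,(\sigma_x^2 + \bar{x}^2) = 0 \qquad \ldots(6)$$

From (5) it is evident that the mean $(\bar{x}, \bar{y})$ of the distribution
lies on the line of regression of y on x. Substituting $a = \bar{y} - b\bar{x}$
in (6) we find

$$\mu_{11} - b\sigma_x^2 = 0, \text{ that is, } b = \frac{\mu_{11}}{\sigma_x^2}.$$

Thus $\bar{y} = a + b\bar{x}$ and $y = a + bx$ lead to

$$y - \bar{y} = b\,(x - \bar{x})$$

or

$$y - \bar{y} = \frac{\mu_{11}}{\sigma_x^2}\,(x - \bar{x}). \qquad \ldots(7)$$

This is the required line of regression of y on x.

If $(\bar{x}, \bar{y})$ is taken as the origin, then we can write (7) as

$$y = \frac{\mu_{11}}{\sigma_x^2}\,x.$$

The gradient $\mu_{11}/\sigma_x^2$ is called the coefficient of regression of
y on x. This is sometimes denoted by $b_{yx}$.

By interchanging the variables x and y in (7), we find that
the line of regression of x on y is

$$x - \bar{x} = \frac{\mu_{11}}{\sigma_y^2}\,(y - \bar{y}) \qquad \ldots(8)$$

and $\mu_{11}/\sigma_y^2$ is the coefficient of regression of x on y (denoted by $b_{xy}$).
The coefficient of correlation, r, is defined by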


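$$r = \frac{\mu_{11}}{\sigma_x\sigma_y}, \quad \text{so that} \quad r^2 = \frac{\mu_{11}}{\sigma_x^2}\cdot\frac{\mu_{11}}{\sigma_y^2}, \qquad \ldots(9)$$

that is, r is the geometric mean of the coefficients of regression.
Since the arithmetic mean of two positive quantities is never less
than the geometric mean,

$$\frac{1}{2}\left(\frac{\mu_{11}}{\sigma_x^2} + \frac{\mu_{11}}{\sigma_y^2}\right) \ge r.$$

As $\sigma_x\sigma_y$ is always positive, the sign of r in (9) is the same as
that of the covariance $\mu_{11}$.

The angle of inclination $\theta$ between (7) and (8) is given by

$$\tan\theta = \frac{\dfrac{\sigma_y^2}{\mu_{11}} - \dfrac{\mu_{11}}{\sigma_x^2}}{1 + \dfrac{\sigma_y^2}{\mu_{11}}\cdot\dfrac{\mu_{11}}{\sigma_x^2}}$$

[The formula $\tan\theta = \dfrac{m_1 - m_2}{1 + m_1m_2}$ is used]

or

$$\tan\theta = \frac{1 - r^2}{r}\cdot\frac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}. \qquad \ldots(10)$$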


If $r = \pm 1$, $\theta = 0$, that is, the two lines of regression coincide
and the sum of squares of deviations from either line of regression
is zero. In this case there is a linear functional relation between
the variables x and y. The nearer $r^2$ is to unity, the closer are the
points to the lines of regression, and the nearer are these two
lines to coincidence.

If r = 0, $\theta = \pi/2$, that is, the two lines of regression are
perpendicular and the variables x and y are uncorrelated. Thus the
magnitude of r is the measure of the degree to which the association
between x and y approaches a linear functional relationship. r is
positive when, on the whole, y increases with x, and negative when
y decreases as x increases. Some authors use the symbol $\rho_{xy}$ for
the coefficient of correlation r. Thus we may write

$$r = \rho_{xy} = \frac{\text{Cov}(x, y)}{\sqrt{\text{Var}(x)\,\text{Var}(y)}} = \frac{\sigma_{xy}}{\sigma_x\sigma_y}.$$

For calculation purposes the following formula for r is more convenient.

$$r = \frac{E\{(x - \bar{x})(y - \bar{y})\}}{\sqrt{E(x - \bar{x})^2\;E(y - \bar{y})^2}}$$

$$r = \frac{E(xy) - E(x)\,E(y)}{\sqrt{\{E(x^2) - E^2(x)\}\{E(y^2) - E^2(y)\}}} \qquad \ldots(11)$$

The correlation coefficient is a dimensionless quantity. 
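The quantities of this section can be computed directly from paired observations. The sketch below is illustrative only: the function name `regression_summary` and the five data pairs are hypothetical. It returns the two regression coefficients and r, and shows that $r^2$ is the product of the regression coefficients.

```python
import math


def regression_summary(xs, ys):
    """Regression coefficients b_yx, b_xy and correlation r for
    paired observations (each pair taken with unit frequency)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "b_yx": cov / var_x,            # slope of the line of y on x
        "b_xy": cov / var_y,            # slope of the line of x on y
        "r": cov / math.sqrt(var_x * var_y),
    }


if __name__ == "__main__":
    s = regression_summary([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])  # made-up data
    print(s)
    print(s["r"] ** 2, s["b_yx"] * s["b_xy"])
```

The line of regression of y on x is then $y - \bar{y} = b_{yx}(x - \bar{x})$, and that of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$.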

If the variables are independent, then E(xy) = E(x) E(y) and
the coefficient of correlation is zero. But zero correlation does
not necessarily imply independence between the variables. This is
evident from the following example.

    x        :   -3   -2   -1    0    1    2    3
    y = x^2  :    9    4    1    0    1    4    9
    xy       :  -27   -8   -1    0    1    8   27

Here $\sum x = 0 = \sum xy \Rightarrow E(x) = 0 = E(xy)$, so that

r = 0.

But x and y are connected by the relation $y = x^2$.
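The example can be verified in a couple of lines; this is just a direct check of the table above, with variable names chosen for illustration.

```python
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]          # y is an exact function of x
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Covariance by the formula Cov(x, y) = E(xy) - E(x) E(y).
cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y

print(cov_xy)   # zero covariance, hence r = 0, although y = x^2
```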

Regression curves in terms of conditional expectations.

We recall the notations

$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy,$$

where f(x, y) is the joint probability density of (X, Y),

$$h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx,$$

$$g(x \mid y) = \frac{f(x, y)}{h(y)}, \qquad h(y \mid x) = \frac{f(x, y)}{g(x)}.$$

The function of y

$$E(X \mid y) = \int_{-\infty}^{\infty} x\,g(x \mid y)\,dx$$

is called the conditional expectation of X for given y, and the
function of x

$$E(Y \mid x) = \int_{-\infty}^{\infty} y\,h(y \mid x)\,dy$$

the conditional expectation of Y for given x.

The graph of E(X | y) as a function of y,

$$x = E(X \mid y),$$

is called the regression curve of X on Y, and the graph of E(Y | x),

$$y = E(Y \mid x),$$

is called the regression curve of Y on X.


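Theorem. For any random variable (X, Y) with finite $\sigma_X$
and $\sigma_Y$ we have

$$|\sigma_{XY}| \le \sigma_X\sigma_Y.$$

For

$$\sigma_{XY}^2 = \left\{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [x - E(X)]\,[y - E(Y)]\,f(x, y)\,dx\,dy\right\}^2$$

$$\le \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [x - E(X)]^2 f(x, y)\,dx\,dy \times \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [y - E(Y)]^2 f(x, y)\,dx\,dy$$

(on using Schwarz's inequality $|E(XY)| \le [E(X^2)\,E(Y^2)]^{1/2}$)

$$= \sigma_X^2\,\sigma_Y^2$$

and hence

$$|\sigma_{XY}| \le \sigma_X\sigma_Y.$$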


Ex. 1. Show that for the joint probability density function

$$f(x, y) = \frac{1}{\pi} \text{ over } x^2 + y^2 \le 1, \qquad = 0, \text{ otherwise},$$

the coefficient of correlation is zero.

(It is again an example of the fact that from r = 0 it cannot
be concluded that X and Y are independent.)

We have

$$g(x) = \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} \frac{1}{\pi}\,dy = \frac{2}{\pi}\sqrt{1-x^2}.$$

Hence

$$\bar{x} = \int_{-1}^{+1} x\,g(x)\,dx = \frac{2}{\pi}\int_{-1}^{+1} x\sqrt{1-x^2}\,dx = 0.$$

Also $h(y) = \dfrac{2}{\pi}\sqrt{1-y^2}$, therefore

$$\bar{y} = \int_{-1}^{+1} y\,h(y)\,dy = 0.$$

$$\mu'_{11} = \int_{-1}^{+1}\int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} xy\,f(x, y)\,dy\,dx = \int_{-1}^{+1} \frac{x}{\pi}\left[\frac{y^2}{2}\right]_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} dx = 0.$$

Therefore $\mu_{11} = \mu'_{11} - \bar{x}\bar{y} = 0$.

Consequently $r_{xy} = 0$. But X and Y are connected through the
equation of the circle.

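Ex. 2. Show that

$$r = \frac{\sigma_X^2 + \sigma_Y^2 - \sigma_{X-Y}^2}{2\sigma_X\sigma_Y}.$$

Let X - Y = Z

so that $\bar{X} - \bar{Y} = \bar{Z}$.

Hence $(X - \bar{X}) - (Y - \bar{Y}) = Z - \bar{Z}$.

Squaring and taking expectations,

$$E(X - \bar{X})^2 - 2E(X - \bar{X})(Y - \bar{Y}) + E(Y - \bar{Y})^2 = E(Z - \bar{Z})^2$$

or $\sigma_X^2 - 2\,\text{cov}(X, Y) + \sigma_Y^2 = \sigma_Z^2$,

that is, $\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 - 2r\sigma_X\sigma_Y$ $\left[\because r = \dfrac{\text{cov}(X, Y)}{\sigma_X\sigma_Y}\right]$

whence

$$r = \frac{\sigma_X^2 + \sigma_Y^2 - \sigma_{X-Y}^2}{2\sigma_X\sigma_Y}.$$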


9.3. Limits for the correlation coefficient.

We have seen in section 9.2 that if the line of regression of y on
x is

$$y = a + bx, \text{ then } \bar{y} = a + b\bar{x} \text{ and } b = \frac{\mu_{11}}{\sigma_x^2}.$$

We prove that $\sum_i f_i\,(y_i - a - bx_i)^2 = N\sigma_y^2\,(1 - r^2)$.

For the sake of convenience, let us assume that $\bar{x} = \bar{y} = 0$ so
that a = 0.

Then

$$\sum_i f_i\,(y_i - a - bx_i)^2 = \sum_i f_i\,(y_i - bx_i)^2$$
$$= \sum_i f_i\,y_i^2 - 2b\sum_i f_i\,x_i\,y_i + b^2\sum_i f_i\,x_i^2$$
$$= N\sigma_y^2 - 2Nb\mu_{11} + Nb^2\sigma_x^2$$
$$= N\sigma_y^2\left(1 - \frac{\mu_{11}^2}{\sigma_x^2\sigma_y^2}\right) \quad \left(\text{putting } b = \frac{\mu_{11}}{\sigma_x^2}\right)$$
$$= N\sigma_y^2\,(1 - r^2) \qquad \ldots(1)$$

Denoting the sum of the squares of the deviations of points
measured parallel to the y-axis by $NS_y^2$, we have

$$NS_y^2 = N\sigma_y^2\,(1 - r^2)$$

or $r^2 = 1 - (S_y^2/\sigma_y^2)$ ...(2)

$S_y$ is called the standard error of estimate of y from the
regression equation

$$y - \bar{y} = \frac{\mu_{11}}{\sigma_x^2}\,(x - \bar{x}).$$

Similarly the sum of the squares of the deviations of points
from the line of regression of x on y, measured parallel to the
x-axis, is $NS_x^2$, where

$$S_x^2 = \sigma_x^2\,(1 - r^2) \qquad \ldots(3)$$

and $S_x$ is the standard error of estimate of x from the regression
equation

$$x - \bar{x} = \frac{\mu_{11}}{\sigma_y^2}\,(y - \bar{y}).$$

Since the sum of squares of deviations cannot be negative,

$$(1 - r^2) \ge 0 \Rightarrow r^2 \le 1 \quad \text{or} \quad -1 \le r \le 1.$$
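Relation (1) of this section can be checked numerically by fitting the least squares line and comparing the residual sum of squares with $N\sigma_y^2(1 - r^2)$. The sketch below is illustrative; the function name and the data are made up.

```python
def least_squares_check(xs, ys):
    """Return (residual sum of squares about the line of y on x,
    N * var_y * (1 - r^2)) for unit-frequency pairs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    b = cov / var_x                  # regression coefficient of y on x
    a = mean_y - b * mean_x          # the line passes through the means
    rss = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    r_squared = cov ** 2 / (var_x * var_y)
    return rss, n * var_y * (1 - r_squared)


if __name__ == "__main__":
    print(least_squares_check([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

Dividing either quantity by N and taking the square root gives the standard error of estimate $S_y$.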

Alternatively we prove this result as follows.

Let $X_0 = \dfrac{X - \bar{X}}{\sigma_X}$, $Y_0 = \dfrac{Y - \bar{Y}}{\sigma_Y}$.

Then

$$r = \frac{E(X - \bar{X})(Y - \bar{Y})}{\sigma_X\sigma_Y} = E(X_0Y_0).$$

We consider a non-negative expression

$$E(X_0 \pm Y_0)^2 \ge 0$$

or $E(X_0^2 + Y_0^2 \pm 2X_0Y_0) \ge 0$.

Since $E(X_0^2) = E(Y_0^2) = 1$, this reduces to

$$1 + 1 \pm 2r \ge 0$$

or $-1 \le r \le 1$, i.e. $|r| \le 1$.

We can easily derive the limits for r exploiting Schwarz's
inequality. Schwarz's inequality states that for any stochastic variates
x and y,

$$[E(xy)]^2 \le E(x^2)\,E(y^2).$$

Proof. We have

$E(x - \lambda y)^2 \ge 0$ for every real $\lambda$,

or $\lambda^2E(y^2) - 2\lambda E(xy) + E(x^2) \ge 0$.

Let $E(y^2) = a$, $E(xy) = b$, $E(x^2) = c$.

Then $a\lambda^2 - 2\lambda b + c \ge 0$

or $\dfrac{1}{a}\,[a^2\lambda^2 - 2\lambda ab + ac] \ge 0$

or $\dfrac{1}{a}\,[(a\lambda - b)^2 + ac - b^2] \ge 0$.

This implies $b^2 \le ac$

or $[E(xy)]^2 \le E(x^2)\,E(y^2)$,

hence $[E(x - \bar{x})(y - \bar{y})]^2 \le E(x - \bar{x})^2\,E(y - \bar{y})^2$

or $\mu_{11}^2 \le \sigma_x^2\,\sigma_y^2$

or $\dfrac{\mu_{11}^2}{\sigma_x^2\,\sigma_y^2} \le 1$, i.e. $r^2 \le 1$ or $-1 \le r \le 1$.

Ox*C tf 2 


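Remark:

If y = mx ($m \ne 0$), then
r = 1 if m > 0, and r = -1 if m < 0.

$y = mx \Rightarrow \bar{y} = m\bar{x}$,

so that $y - \bar{y} = m\,(x - \bar{x})$,

hence

$$\mu_{11} = E(x - \bar{x})(y - \bar{y}) = mE(x - \bar{x})^2 = m\sigma_x^2,$$
$$\sigma_y^2 = E(y - \bar{y})^2 = m^2E(x - \bar{x})^2 = m^2\sigma_x^2,$$

therefore

$$r = \frac{\mu_{11}}{\sigma_x\sigma_y} = \frac{m\sigma_x^2}{|m|\,\sigma_x^2} = \pm 1.$$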

Note. There is an interesting geometric interpretation of the
correlation coefficient if (X, Y) is a discrete random variable
assuming only a finite number of values, say $(x_1, y_1), \ldots, (x_n, y_n)$.
Without loss of generality, suppose that E(X) = E(Y) = 0. Then r may
be written as

$$r = \frac{\sum_i x_i\,y_i\,p(x_i, y_i)}{\left[\sum_i x_i^2\,p(x_i)\;\sum_i y_i^2\,p(y_i)\right]^{1/2}}.$$

Specifically, suppose n equals 2. Then r simply represents (except for the
probability weight) the cosine of the angle between the vector
$OP = (x_1, x_2)$ and the vector $OQ = (y_1, y_2)$. Therefore

$$\cos\theta = \frac{OP \cdot OQ}{|OP|\,|OQ|} \qquad [\text{recall } \mathbf{a}\cdot\mathbf{b} = ab\cos\theta]$$

$$= \frac{(x_1, x_2)\cdot(y_1, y_2)}{\sqrt{x_1^2 + x_2^2}\,\sqrt{y_1^2 + y_2^2}} = \frac{x_1y_1 + x_2y_2}{\sqrt{x_1^2 + x_2^2}\,\sqrt{y_1^2 + y_2^2}} = r \qquad (\because E(X) = E(Y) = 0).$$

This cosine always lies between -1 and 1. In particular,
when it is equal to 1 or -1, the vectors are parallel in the
same or opposite directions. If the cosine equals zero, the vectors
are perpendicular.

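9.4. Effect of change in location and scale on r.

Let $u = ax + b$, $v = cy + d$,

then $E(u) = aE(x) + b$, $E(v) = cE(y) + d$

or $\bar{u} = a\bar{x} + b$, $\bar{v} = c\bar{y} + d$.

Hence $u - \bar{u} = a\,(x - \bar{x})$, $v - \bar{v} = c\,(y - \bar{y})$.

Thus

$$\text{cov}(u, v) = E(u - \bar{u})(v - \bar{v}) = ac\,E(x - \bar{x})(y - \bar{y}) = ac\,\text{cov}(x, y),$$

$$\text{var}(u) = E(u - \bar{u})^2 = a^2E(x - \bar{x})^2 = a^2\,\text{var}(x).$$

Similarly $\text{var}(v) = c^2\,\text{var}(y)$.

Now

$$r_{uv} = \frac{\text{cov}(u, v)}{\sqrt{\text{var}(u)\,\text{var}(v)}} = \frac{ac\,\text{cov}(x, y)}{|ac|\,\sqrt{\text{var}(x)\,\text{var}(y)}} = \frac{ac}{|ac|}\;r_{xy}.$$

The regression coefficient of y on x is

$$\frac{\mu_{11}}{\sigma_x^2} = \frac{\text{cov}(x, y)}{\text{var}(x)} = \frac{(1/ac)\,\text{cov}(u, v)}{(1/a^2)\,\text{var}(u)} = \frac{a}{c}\,(\text{regression coefficient of } v \text{ on } u).$$

If then a and c are equal, the regression coefficient is the
same for both pairs of variables.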
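The effect of the transformation $u = ax + b$, $v = cy + d$ can be seen on a small numerical case. The sketch below uses hypothetical helper names and made-up data; with a = 2 and c = -5 the product ac is negative, so the sign of r is reversed while its magnitude is unchanged.

```python
import math


def cov(xs, ys):
    """Covariance of unit-frequency pairs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def corr(xs, ys):
    """Coefficient of correlation."""
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))


if __name__ == "__main__":
    xs, ys = [1, 2, 3, 4, 5], [2, 1, 4, 3, 5]       # made-up data
    a, b, c, d = 2, 3, -5, 1                        # illustrative constants
    us = [a * x + b for x in xs]
    vs = [c * y + d for y in ys]
    print(corr(xs, ys), corr(us, vs))               # equal in size, opposite in sign
    b_yx = cov(xs, ys) / cov(xs, xs)
    b_vu = cov(us, vs) / cov(us, us)
    print(b_yx, (a / c) * b_vu)                     # b_yx = (a/c) b_vu
```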


Ex. 1. Find $r_{xy}$ for the two series

    x :  105  104  102  101  100   99   98   96   93   92
    y :  101  103  100   98   95   96  104   92   97   94

(Agra '50, '54)

We set u = x - 99, v = y - 98 and prepare the table as under:

    u   :    6    5    3    2    1    0   -1   -3   -6   -7     Σu  = 0
    v   :    3    5    2    0   -3   -2    6   -6   -1   -4     Σv  = 0
    uv  :   18   25    6    0   -3    0   -6   18    6   28     Σuv = 92
    u^2 :   36   25    9    4    1    0    1    9   36   49     Σu^2 = 170
    v^2 :    9   25    4    0    9    4   36   36    1   16     Σv^2 = 140

Here N = 10, $\bar{u} = \dfrac{1}{N}\sum u = 0$, $\bar{v} = \dfrac{1}{N}\sum v = 0$.

Hence

$$\sigma_u^2 = \frac{1}{N}\sum u^2 = \frac{170}{10} = 17, \quad \sigma_v^2 = \frac{1}{N}\sum v^2 = \frac{140}{10} = 14, \quad \text{cov}(u, v) = \frac{1}{N}\sum uv = \frac{92}{10} = 9.2.$$

Therefore

$$r = \frac{9.2}{\sqrt{17 \times 14}} = 0.596.$$
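The arithmetic of this example can be reproduced as follows. This is only a sketch of the same working with the shifted values u = x - 99, v = y - 98; the variable names are chosen for illustration.

```python
import math

xs = [105, 104, 102, 101, 100, 99, 98, 96, 93, 92]
ys = [101, 103, 100, 98, 95, 96, 104, 92, 97, 94]

us = [x - 99 for x in xs]         # deviations from the working origin 99
vs = [y - 98 for y in ys]         # deviations from the working origin 98
n = len(us)

sum_uv = sum(u * v for u, v in zip(us, vs))
sum_uu = sum(u * u for u in us)
sum_vv = sum(v * v for v in vs)
mean_u, mean_v = sum(us) / n, sum(vs) / n

cov_uv = sum_uv / n - mean_u * mean_v
var_u = sum_uu / n - mean_u ** 2
var_v = sum_vv / n - mean_v ** 2
r = cov_uv / math.sqrt(var_u * var_v)

print(sum_uv, sum_uu, sum_vv, round(r, 3))
```

Since a change of origin leaves r unaltered (section 9.4), the value found for (u, v) is also $r_{xy}$.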



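Ex. 2. For the following table:

    Ages of husbands |         Ages of wives in years
    in years         | 10-20  20-30  30-40  40-50  50-60 | Total
    -----------------+-----------------------------------+------
    15-25            |   6      3      -      -      -   |    9
    25-35            |   3     16     10      -      -   |   29
    35-45            |   -     10     15      7      -   |   32
    45-55            |   -      -      7     10      4   |   21
    55-65            |   -      -      -      4      5   |    9
    -----------------+-----------------------------------+------
    Total            |   9     29     32     21      9   |  100

Find (i) the coefficient of correlation,
(ii) the two regression lines.

(M.A. Agra 1953)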


Ages of wives and husbands.

                x :   15    25    35    45    55 |       Σfv     Σfv^2            Σfuv
      y    v    u :   -2    -1     0     1     2 |  N_r  or vN_r or v^2 N_r   U   or vU
    -------------------------------------------- + -------------------------------------
     20   -2      |    6     3     -     -     - |    9   -18     36        -15     30
     30   -1      |    3    16    10     -     - |   29   -29     29        -22     22
     40    0      |    -    10    15     7     - |   32     0      0         -3      0
     50    1      |    -     -     7    10     4 |   21    21     21         18     18
     60    2      |    -     -     -     4     5 |    9    18     36         14     28
    -------------------------------------------- + -------------------------------------
     N_c          |    9    29    32    21     9 |  100    -8    122         -8     98
     Σfu or uN_c  |  -18   -29     0    21    18 |   -8
     Σfu^2 or u^2 N_c | 36  29     0    21    36 |  122
     V            |  -15   -22    -3    18    14 |   -8
     Σfuv or uV   |   30    22     0    18    28 |   98

Σfv or V and Σfuv or uV are not necessary except for check
purposes only.








































Explanation. x represents the mid-values of the ages of wives
and y represents the mid-values of the ages of husbands.

Note. If x = a + cu then $\sigma_x = c\sigma_u$,
and if y = a' + c'v then $\sigma_y = c'\sigma_v$,
and cov(x, y) = cc' cov(u, v).

b = regression coefficient of y on x

$$= \frac{\text{cov}(x, y)}{\sigma_x^2} = \frac{c'}{c}\,(\text{regression coefficient of } v \text{ on } u).$$

If c and c' are equal, the value b is the same for both pairs
of variables.

Let $u = \dfrac{x - 35}{10}$ or $x = 35 + 10u$, hence $\bar{x} = 35 + 10\bar{u}$,

and $v = \dfrac{y - 40}{10}$ or $y = 40 + 10v$, hence $\bar{y} = 40 + 10\bar{v}$,

so that c = 10, c' = 10.

In the row prefixed $N_c$ the frequencies are given for the
individual columns, and in the column headed by $N_r$ the frequencies
for the separate rows.

Let us take as new origin the point (35, 40) whose coordinates
are the mid-values of the class 30-40 for x and the class
35-45 for y; and, as a new unit for each variable, the common
class interval of 10 years. The row prefixed V gives for each
column the sum Σfv. Thus the sum -15, in the column u = -2,
is obtained from

6(-2) + 3(-1) = -15.

The column headed by U gives for each row the sum Σfu.
Thus the sum -22, in the row v = -1, is obtained from

3(-2) + 16(-1) = -22.

The last column, headed by vU, gives the sum Σfuv for each
row.

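(i) The coefficient of correlation is given by

$$r = \frac{\dfrac{1}{N}\sum fuv - \bar{u}\,\bar{v}}{\sigma_u\,\sigma_v} \qquad \left(\text{where } \bar{u} = \frac{1}{N}\sum fu,\ \bar{v} = \frac{1}{N}\sum fv\right)$$

$$= \frac{98/100 - (-8/100)(-8/100)}{\sqrt{\{122/100 - (-8/100)^2\}\{122/100 - (-8/100)^2\}}} \qquad (\text{here } N = 100)$$

$$= \frac{0.98 - 0.0064}{1.22 - 0.0064} = \frac{0.9736}{1.2136} = 0.802.$$

(ii) $\bar{u} = \bar{v} = -\dfrac{8}{100} = -0.08$; $\text{cov}(u, v) = \dfrac{1}{N}\sum fuv - \bar{u}\,\bar{v} = 0.9736$;

$$\sigma_u^2 = \sigma_v^2 = \frac{1}{N}\sum fu^2 - \bar{u}^2 = \frac{122}{100} - \left(\frac{-8}{100}\right)^2 = 1.22 - 0.0064 = 1.2136.$$

The line of regression of v on u is

$$v - \bar{v} = \frac{\text{cov}(u, v)}{\sigma_u^2}\,(u - \bar{u}) \quad \text{or} \quad v + 0.08 = 0.802\,(u + 0.08).$$

The line of regression of u on v is

$$u - \bar{u} = \frac{\text{cov}(u, v)}{\sigma_v^2}\,(v - \bar{v}) \quad \text{or} \quad u + 0.08 = 0.802\,(v + 0.08).$$

Now

$\bar{x} = 35 + 10\bar{u} = 35 - 0.8 = 34.2$

$\bar{y} = 40 + 10\bar{v} = 40 - 0.8 = 39.2$

$\sigma_x = c\,\sigma_u = 10\sqrt{1.2136}$

$\sigma_y = c'\sigma_v = 10\sqrt{1.2136}$

$\text{cov}(x, y) = cc'\,\text{cov}(u, v) = 100\,(0.9736)$

Thus the line of regression of y on x is

$$y - \bar{y} = \frac{\text{cov}(x, y)}{\sigma_x^2}\,(x - \bar{x}) \quad \text{or} \quad y - 39.2 = \frac{97.36}{121.36}\,(x - 34.2),$$

i.e. y - 39.2 = 0.802 (x - 34.2) or y = 0.802 x + 11.772.

The line of regression of x on y is

$$x - \bar{x} = \frac{\text{cov}(x, y)}{\sigma_y^2}\,(y - \bar{y})$$

or x - 34.2 = 0.802 (y - 39.2),
i.e. x = 0.802 y + 2.762.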
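The grouped-data working of this example can be reproduced from the frequency table. The sketch below is illustrative: the function name `grouped_summary` is hypothetical, rows correspond to v = -2, ..., 2 (husbands) and columns to u = -2, ..., 2 (wives), and the change of origin and unit (x = 35 + 10u, y = 40 + 10v) is the one used above. Because the slope is carried to full precision rather than rounded to 0.802, the intercepts differ slightly in the second decimal from those in the text.

```python
import math

# Frequencies f(u, v): rows are v = -2..2, columns are u = -2..2.
FREQ = [
    [6, 3, 0, 0, 0],
    [3, 16, 10, 0, 0],
    [0, 10, 15, 7, 0],
    [0, 0, 7, 10, 4],
    [0, 0, 0, 4, 5],
]
STEPS = [-2, -1, 0, 1, 2]


def grouped_summary(freq, steps):
    """Means, variances, covariance and r in the working units u, v."""
    n = sum(sum(row) for row in freq)
    sum_u = sum(f * u for row in freq for f, u in zip(row, steps))
    sum_v = sum(f * v for row, v in zip(freq, steps) for f in row)
    sum_uu = sum(f * u * u for row in freq for f, u in zip(row, steps))
    sum_vv = sum(f * v * v for row, v in zip(freq, steps) for f in row)
    sum_uv = sum(f * u * v for row, v in zip(freq, steps)
                 for f, u in zip(row, steps))
    mean_u, mean_v = sum_u / n, sum_v / n
    var_u = sum_uu / n - mean_u ** 2
    var_v = sum_vv / n - mean_v ** 2
    cov_uv = sum_uv / n - mean_u * mean_v
    r = cov_uv / math.sqrt(var_u * var_v)
    return n, mean_u, mean_v, var_u, var_v, cov_uv, r


n, mean_u, mean_v, var_u, var_v, cov_uv, r = grouped_summary(FREQ, STEPS)
mean_x = 35 + 10 * mean_u                 # back to years
mean_y = 40 + 10 * mean_v
slope_yx = cov_uv / var_u                 # c'/c = 1, so the slope is unchanged
intercept_yx = mean_y - slope_yx * mean_x

print(n, round(r, 3), mean_x, mean_y, round(slope_yx, 3), round(intercept_yx, 3))
```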


Exercises 

Ex. 1. Calculate $r_{xy}$ for the following ages of husband (x)
and wife (y):

    x :  23  27  28  29  30  31  33  35  36  39
    y :  18  22  23  24  25  26  28  29  30  32

(r = 0.92) (Agra '52)

Ex. 2. A computer, while calculating $r_{xy}$ from 25 pairs of
observations, obtained the following constants:

n = 25, Σx = 125, Σx² = 650, Σy = 100, Σy² = 460, Σxy = 508.

A recheck showed that he had copied down two pairs as (6, 14),
(8, 6) while the correct values were (8, 12), (6, 8). Obtain the
correct value of the correlation coefficient.

[Delhi (Hons) '52; Punjab '53]

(Ans. r = 2/3).

[Hint. Σx, Σx² and Σy are unaffected; Σxy must be increased by 12 and
Σy² decreased by 24. Use the formula

$$r = \frac{\dfrac{1}{n}\sum xy - \bar{x}\,\bar{y}}{\sqrt{\left(\dfrac{1}{n}\sum x^2 - \bar{x}^2\right)\left(\dfrac{1}{n}\sum y^2 - \bar{y}^2\right)}}\ ]$$

Ex. 3. Compute the correlation coefficient r from the following
data:


Variate 

y 


l 1 - 5- 10- 


Variaie x 

15- 20- 25- 30- 35- 40- 


Totals 



Compute the regression equations and their standard errors
for the above data.

[Ans. r = 0.766, y - 23.2 = 0.8575 (x - 24.5),

x - 24.5 = 0.6841 (y - 23.2),

$S_y = \sqrt{91.51\,(1 - 0.766^2)}$, $S_x = \sqrt{73\,(1 - 0.766^2)}$]

Ex. 4. The following marks have been obtained by a class
of students in statistics (out of 100):

    Paper I  :  45  55  56  58  60  65  68  70  75  80  85
    Paper II :  56  50  48  60  62  64  65  70  74  82  90

Compute the coefficient of correlation for the above data.
Find also the equations of the lines of regression.

[B.Sc. Agra 1954]

[Ans. r = 0.918, x = 0.85 y + 9.48, y = 0.99 x + 1]

Ex. 5. Two independent variates $x_1$ and $x_2$ have means 5
and 10 and variances 4 and 9 respectively. Obtain the correlation
coefficient between

$y_1 = 3x_1 + 4x_2$ and $y_2 = 3x_1 - x_2$.

[B.A. Pb. 53, 54]

[Hint. $E(x_1) = 5$, $E(x_2) = 10$,

$E(x_1 - 5)^2 = 4$, $E(x_2 - 10)^2 = 9$, whence $E(x_1^2) = 29$, $E(x_2^2) = 109$,

$E(y_1) = 3E(x_1) + 4E(x_2) = 55$, $E(y_2) = 3E(x_1) - E(x_2) = 5$,

$E(y_1y_2) = E[9x_1^2 - 4x_2^2 + 9x_1x_2] = 9E(x_1^2) - 4E(x_2^2) + 9E(x_1)\,E(x_2) = 275$,

$\text{Cov}(y_1, y_2) = E(y_1y_2) - E(y_1)\,E(y_2) = 275 - 275 = 0$.

Hence

$$r = \frac{\text{cov}(y_1, y_2)}{\sqrt{\text{var}(y_1)\,\text{var}(y_2)}} = 0\ ]$$



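Ex. 6. Establish the formula

$$n\,r\,\sigma_x\sigma_y = n_1r_1\sigma_{x_1}\sigma_{y_1} + n_2r_2\sigma_{x_2}\sigma_{y_2} + n_1\,dx_1\,dy_1 + n_2\,dx_2\,dy_2$$

where $n_1$, $n_2$ and n are respectively the sizes of the first,
second and combined sample, $(\bar{x}_1, \bar{y}_1)$, $(\bar{x}_2, \bar{y}_2)$, $(\bar{x}, \bar{y})$ their means,
$r_1$, $r_2$ and r their coefficients of correlation,
$(\sigma_{x_1}, \sigma_{y_1})$, $(\sigma_{x_2}, \sigma_{y_2})$, $(\sigma_x, \sigma_y)$
their standard deviations and

$dx_1 = \bar{x}_1 - \bar{x}$, $dy_1 = \bar{y}_1 - \bar{y}$, $dx_2 = \bar{x}_2 - \bar{x}$, $dy_2 = \bar{y}_2 - \bar{y}$.

[B.Sc. Agra '65]

[Hint. We have

$$\bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}, \qquad \bar{y} = \frac{n_1\bar{y}_1 + n_2\bar{y}_2}{n_1 + n_2}.$$

$$\sum_{1}^{n}(x - \bar{x})^2 = \sum_{1}^{n_1}(x_1 - \bar{x})^2 + \sum_{n_1+1}^{n}(x_2 - \bar{x})^2$$

$$= \sum_{1}^{n_1}(x_1 - \bar{x}_1 + \bar{x}_1 - \bar{x})^2 + \sum_{n_1+1}^{n}(x_2 - \bar{x}_2 + \bar{x}_2 - \bar{x})^2$$

$$= \sum_{1}^{n_1}(x_1 - \bar{x}_1)^2 + n_1\,dx_1^2 + \sum_{n_1+1}^{n}(x_2 - \bar{x}_2)^2 + n_2\,dx_2^2$$

or $n\sigma_x^2 = n_1(\sigma_{x_1}^2 + dx_1^2) + n_2(\sigma_{x_2}^2 + dx_2^2)$.

Similarly $n\sigma_y^2 = n_1(\sigma_{y_1}^2 + dy_1^2) + n_2(\sigma_{y_2}^2 + dy_2^2)$.

$$r_1 = \frac{\dfrac{1}{n_1}\sum_{1}^{n_1}(x_1 - \bar{x}_1)(y_1 - \bar{y}_1)}{\sigma_{x_1}\sigma_{y_1}}, \qquad r_2 = \frac{\dfrac{1}{n_2}\sum_{n_1+1}^{n}(x_2 - \bar{x}_2)(y_2 - \bar{y}_2)}{\sigma_{x_2}\sigma_{y_2}}.$$

For the pooled sample

$$r = \frac{\dfrac{1}{n}\sum_{1}^{n}(x - \bar{x})(y - \bar{y})}{\sigma_x\sigma_y}$$

or

$$n\,r\,\sigma_x\sigma_y = \sum_{1}^{n}(x - \bar{x})(y - \bar{y}) = \sum_{1}^{n_1}(x_1 - \bar{x})(y_1 - \bar{y}) + \sum_{n_1+1}^{n}(x_2 - \bar{x})(y_2 - \bar{y}) \qquad \ldots(1)$$

Now

$$\sum_{1}^{n_1}(x_1 - \bar{x})(y_1 - \bar{y}) = \sum_{1}^{n_1}(x_1 - \bar{x}_1 + \bar{x}_1 - \bar{x})(y_1 - \bar{y}_1 + \bar{y}_1 - \bar{y})$$

$$= \sum_{1}^{n_1}(x_1 - \bar{x}_1 + dx_1)(y_1 - \bar{y}_1 + dy_1)$$

$$= \sum_{1}^{n_1}(x_1 - \bar{x}_1)(y_1 - \bar{y}_1) + dx_1\sum_{1}^{n_1}(y_1 - \bar{y}_1) + dy_1\sum_{1}^{n_1}(x_1 - \bar{x}_1) + n_1\,dx_1\,dy_1$$

$$= n_1r_1\sigma_{x_1}\sigma_{y_1} + n_1\,dx_1\,dy_1 \qquad \left[\because \sum_{1}^{n_1}(y_1 - \bar{y}_1) = \sum_{1}^{n_1}(x_1 - \bar{x}_1) = 0\right]$$

Similarly

$$\sum_{n_1+1}^{n}(x_2 - \bar{x})(y_2 - \bar{y}) = n_2r_2\sigma_{x_2}\sigma_{y_2} + n_2\,dx_2\,dy_2.$$

Substituting in (1), we get the required result].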


Ex. 7. Find the correlation coefficient of the combined sample,
given that :

                                      Sample I    Sample II
Sample size                             100          150
Sample mean ($\bar x$)                   80           72
Sample mean ($\bar y$)                  100          118
Sample variance ($\sigma_x^2$)           10           12
Sample variance ($\sigma_y^2$)           15            1
Correlation coefficient                 0.6          0.4

[Agra B. Sc. '64]

[Ans. $r = 0.8186$]

Ex. 8. (a) If a linear relation exists between the two chance
variables x and y, prove that the correlation coefficient between
them is +1 or -1. [Agra M. Sc. 1952]

(b) Define when two random variables are said to be (i) independent,
(ii) uncorrelated. If $\rho$ is the coefficient of correlation
between two independent variables, show that $\rho = 0$.

By taking two random variables x and y such that (i) x has a
uniform distribution over (-1, 1) and (ii) $y = x^2$, show that the
converse of the above statement is not true.

(c) Two variables x and y have the regression lines with the
equations $8x - 10y + 61 = 0$ and $40x - 18y = 214$. If $V(x) = 3$, find
the means of x and y, $V(y)$ and the coefficient of correlation between
x and y.
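
The counter-example in Ex. 8 (b) can be checked with exact arithmetic. This is only an illustrative sketch: the helper `uniform_moment` is a hypothetical name, and it relies on the standard fact that for x uniform on (-1, 1) the moment $E(x^k)$ is 0 for odd k and $1/(k+1)$ for even k.

```python
from fractions import Fraction


def uniform_moment(k):
    """E[x^k] for x uniform on (-1, 1): zero for odd k, 1/(k+1) for even k."""
    return Fraction(0) if k % 2 else Fraction(1, k + 1)


e_x = uniform_moment(1)    # E(x)
e_y = uniform_moment(2)    # E(y) with y = x^2
e_xy = uniform_moment(3)   # E(xy) = E(x^3)
cov_xy = e_xy - e_x * e_y  # covariance of x and y

# Dependence: P(y < 1/4) = P(|x| < 1/2) = 1/2, yet given |x| < 1/2
# the event y < 1/4 is certain, so x and y cannot be independent.
p_y_small = Fraction(1, 2)
p_y_small_given_x_small = Fraction(1)

print(cov_xy, p_y_small, p_y_small_given_x_small)
```

The covariance vanishes, so the variables are uncorrelated, while the conditional probability differs from the unconditional one, so they are not independent.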

Ex. 9. The variables x and y are connected by the equation
$ax + by + c = 0$. Show that the correlation between them is -1
if the signs of a and b are alike and +1 if they are different.

(Lucknow B. Sc. ’67, Agra ’64)

[Hint. $ax + by + c = 0$

$\therefore\ aE(x) + bE(y) + c = 0$

Hence $x - E(x) = -\dfrac{b}{a}\,\{y - E(y)\}$

Cov $(x, y) = E[\{x - E(x)\}\{y - E(y)\}] = -\dfrac{b}{a}\,\sigma_y^2$, and $\sigma_x^2 = \dfrac{b^2}{a^2}\,\sigma_y^2$.

Hence

$$r = \frac{\text{cov}(x, y)}{\sigma_x \sigma_y} = \frac{-b/a}{|\,b/a\,|}$$

= +1, if a and b are of opposite sign
= -1, if a and b are of same sign]

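Ex. 10. If $u = ax + by$ and $v = bx - ay$, and r is the coefficient
of correlation between x and y and if u and v are uncorrelated,
show that

$$\sigma_u \sigma_v = \sigma_x \sigma_y\,(a^2 + b^2)\,\sqrt{1 - r^2}.$$

[Agra ’69, Delhi ’67]

[Hint. $u = ax + by$, $v = bx - ay$.

Hence

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} a & b \\ b & -a \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

or

$$\begin{pmatrix} u^2 & uv \\ uv & v^2 \end{pmatrix} = \begin{pmatrix} a & b \\ b & -a \end{pmatrix} \begin{pmatrix} x^2 & xy \\ yx & y^2 \end{pmatrix} \begin{pmatrix} a & b \\ b & -a \end{pmatrix}$$

$$\begin{pmatrix} \Sigma u^2 & \Sigma uv \\ \Sigma uv & \Sigma v^2 \end{pmatrix} = \begin{pmatrix} a & b \\ b & -a \end{pmatrix} \begin{pmatrix} \Sigma x^2 & \Sigma xy \\ \Sigma xy & \Sigma y^2 \end{pmatrix} \begin{pmatrix} a & b \\ b & -a \end{pmatrix}$$

or

$$\begin{pmatrix} \text{var } u & \text{cov}(u, v) \\ \text{cov}(u, v) & \text{var } v \end{pmatrix} = \begin{pmatrix} a & b \\ b & -a \end{pmatrix} \begin{pmatrix} \text{var } x & \text{cov}(x, y) \\ \text{cov}(x, y) & \text{var } y \end{pmatrix} \begin{pmatrix} a & b \\ b & -a \end{pmatrix}$$

(assume $E(x) = E(y) = 0$).

Taking determinants,

$$\text{var } u \cdot \text{var } v - \text{cov}^2(u, v) = (a^2 + b^2)^2\,[\text{var } x\ \text{var } y - \text{cov}^2(x, y)].$$

If u and v are uncorrelated, cov $(u, v) = 0$, so

$$\text{var } u \cdot \text{var } v = (a^2 + b^2)^2\,[\text{var } x\ \text{var } y - \text{cov}^2(x, y)]$$

or

$$\sigma_u \sigma_v = (a^2 + b^2)\,\sigma_x \sigma_y\,(1 - r^2)^{1/2}, \quad \text{since } r = \frac{\text{cov}(x, y)}{\sigma_x \sigma_y}.]$$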


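Ex. 11. Show that, if x', y' are the deviations of the variables
x and y from their means,

$$r = 1 - \frac{1}{2N}\sum \left(\frac{x'}{\sigma_x} - \frac{y'}{\sigma_y}\right)^2$$

and

$$r = -1 + \frac{1}{2N}\sum \left(\frac{x'}{\sigma_x} + \frac{y'}{\sigma_y}\right)^2.$$

Deduce that $-1 \le r \le 1$.

[Lucknow ’67, ’65, Pb. '59 (s)]

[Hint.

$$1 - \frac{1}{2N}\sum \left(\frac{x'}{\sigma_x} - \frac{y'}{\sigma_y}\right)^2 = 1 - \frac{1}{2}\left[\frac{\Sigma x'^2}{N\sigma_x^2} + \frac{\Sigma y'^2}{N\sigma_y^2} - \frac{2\,\Sigma x'y'}{N\sigma_x\sigma_y}\right]$$

$$= 1 - \tfrac{1}{2}\,[1 + 1 - 2r] = r, \quad \text{since } \sigma_x^2 = \frac{1}{N}\Sigma x'^2,\ \sigma_y^2 = \frac{1}{N}\Sigma y'^2,\ r = \frac{\Sigma x'y'}{N\sigma_x\sigma_y}.$$

Similarly the other part.

Since

$$\frac{1}{2N}\sum \left(\frac{x'}{\sigma_x} - \frac{y'}{\sigma_y}\right)^2 \ge 0,$$

$r \le 1$ from the first relation, and similarly from the second relation $r \ge -1$,

i.e. $-1 \le r \le 1$.]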

Ex. 12. (a) If $u = x\cos\alpha + y\sin\alpha$, $v = x\sin\alpha - y\cos\alpha$,
show that $\sigma_u = \sigma_v = \sigma$ and $r_{uv} = 0$, given that $\bar x = \bar y = 0 = r_{xy}$ and
$\sigma_x = \sigma_y = \sigma$. [Delhi ’52, Krk. ’67]

(b) A computer, while calculating the correlation coefficient
between two variates x and y from 25 pairs of observations,
obtained the following constants :

$n = 25$, $\Sigma x = 125$, $\Sigma x^2 = 650$, $\Sigma y = 100$, $\Sigma y^2 = 460$, $\Sigma xy = 508$.

It was, however, later discovered at the time of checking that
he had copied down two pairs as

x  |  y
6  | 14
8  |  6

while the correct values were

x  |  y
8  | 12
6  |  8

Obtain the correct value of the correlation coefficient.

[B. A. Hons Delhi '52]

[Hint. Corrected $\Sigma x = 125 - 6 - 8 + 8 + 6 = 125$

$\Sigma y = 100 - 14 - 6 + 12 + 8 = 100$
$\Sigma x^2 = 650 - 6^2 - 8^2 + 8^2 + 6^2 = 650$
$\Sigma y^2 = 460 - 14^2 - 6^2 + 12^2 + 8^2 = 436$
$\Sigma xy = 508 - 84 - 48 + 96 + 48 = 520$

$$r = \frac{\frac{1}{n}\Sigma xy - \bar x \bar y}{\sigma_x \sigma_y} = \frac{\dfrac{520}{25} - \dfrac{125}{25}\cdot\dfrac{100}{25}}{\sqrt{\left(\dfrac{650}{25} - 25\right)\left(\dfrac{436}{25} - 16\right)}} = \frac{0.8}{1.2} = 2/3]$$
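
The correction in Ex. 12 (b) is a bookkeeping exercise on the five sums, which a short script makes explicit. This is a sketch under the reading that the miscopied pairs were (6, 14) and (8, 6) and the correct pairs (8, 12) and (6, 8); the function names `adjust_sums` and `correlation_from_sums` are hypothetical.

```python
from math import sqrt

n = 25
sums = {"x": 125, "xx": 650, "y": 100, "yy": 460, "xy": 508}
wrong_pairs = [(6, 14), (8, 6)]   # pairs as miscopied
right_pairs = [(8, 12), (6, 8)]   # pairs as they should have been


def adjust_sums(sums, remove, add):
    """Subtract the contribution of `remove` pairs and add that of `add` pairs."""
    out = dict(sums)
    for sign, pairs in ((-1, remove), (+1, add)):
        for x, y in pairs:
            out["x"] += sign * x
            out["y"] += sign * y
            out["xx"] += sign * x * x
            out["yy"] += sign * y * y
            out["xy"] += sign * x * y
    return out


def correlation_from_sums(n, s):
    """Product-moment r computed from n and the five raw sums."""
    mean_x, mean_y = s["x"] / n, s["y"] / n
    cov = s["xy"] / n - mean_x * mean_y
    var_x = s["xx"] / n - mean_x ** 2
    var_y = s["yy"] / n - mean_y ** 2
    return cov / sqrt(var_x * var_y)


corrected = adjust_sums(sums, wrong_pairs, right_pairs)
r_corrected = correlation_from_sums(n, corrected)
print(corrected, r_corrected)
```

Only $\Sigma y^2$ and $\Sigma xy$ change, which is why the hint recomputes r from the same means.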


Ex. 13. If x and y are standardized random variables, and 


find r (x, y) 

[Hint. 

Let 

then 


r {ax+by, bx+ay)^^^, 

fB A. Delhi ’71] 
E(x)=E(y)=0, £ (x*)=£■(+>) =l 
u=ax+by, v=bx \-ay 
£(</) = £(v) = 0 

E(u 7 )—a 2 £(x 2 )+ 6 2 E(y*) + 2ab £ (xy) 

o 2 y - a 2 -\-b 2 -\-2ab r x ,j. Simi 2 arly <j l ^b 2 -\-a 2 -\-2ab r xv . 





or 


Also E(uv) = Lb E(x z )-rab E{y z )-\-{a--\-b‘'-) r X y 
r Uv a u o r —2ab J r{a--\-b' 1 ) r X y 

1 +? °r (a n -+b 2 +2ab r X y) = ?ab + (a-+b-) r xy 


tr + b- 


r« 2 +/> 3 > ] 

rx1, ~[(a-~b-)--2ab] J 


Ex. 14 (a) $x_1$ and $x_2$ are two variates with variances $\sigma_1^2$ and
$\sigma_2^2$ respectively and r is the correlation between them. Determine
the value of the constant k such that

$$x_1 + k x_2 \quad \text{and} \quad x_1 + \frac{\sigma_1}{\sigma_2}\,x_2$$

are uncorrelated.

[Ans. $k = -\dfrac{\sigma_1}{\sigma_2}$]

[B. A. Delhi '60, B. A. Madras ’69]

(b) If x, y, z are uncorrelated random variables with s.d.’s
5, 12 and 9 respectively, and if $u = x + y$ and $v = y + z$, evaluate the
correlation coefficient between u and v.

[Ans. $\frac{48}{65}$]

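Ex. 15. The variables x and y are normally correlated and
u and v are defined by

$u = x\cos\alpha + y\sin\alpha$
$v = y\cos\alpha - x\sin\alpha$

Show that u and v will be uncorrelated if :

$$\tan 2\alpha = \frac{2 r \sigma_x \sigma_y}{\sigma_x^2 - \sigma_y^2}$$

[M. A. (stat) Delhi ’58, Lucknow B. Sc. ’64]

[Hint. $\bar u = \bar x\cos\alpha + \bar y\sin\alpha$

$\bar v = \bar y\cos\alpha - \bar x\sin\alpha$

Cov $(u, v) = E(uv) - \bar u\,\bar v$

$= E[\sin\alpha\cos\alpha\,(y^2 - x^2) + xy\,(\cos^2\alpha - \sin^2\alpha)]$

$\quad - \sin\alpha\cos\alpha\,(\bar y^2 - \bar x^2) - \bar x\bar y\,(\cos^2\alpha - \sin^2\alpha)$

$= \sin\alpha\cos\alpha\,(\sigma_y^2 - \sigma_x^2) + (\cos^2\alpha - \sin^2\alpha)\ \text{cov}(x, y)$

u and v will be uncorrelated if cov $(u, v) = 0$.

Hence $\tfrac{1}{2}\sin 2\alpha\,(\sigma_x^2 - \sigma_y^2) = \cos 2\alpha \cdot r\sigma_x\sigma_y$

or

$$\tan 2\alpha = \frac{2 r \sigma_x \sigma_y}{\sigma_x^2 - \sigma_y^2}.]$$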


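Ex. 16. If x and y are uncorrelated with means zero and
variances $\sigma_1^2$ and $\sigma_2^2$ respectively, show that

$u = x\cos\alpha + y\sin\alpha$

$v = x\sin\alpha - y\cos\alpha$

have a correlation coefficient $r_{uv}$ given by

$$r_{uv} = \frac{\sigma_1^2 - \sigma_2^2}{[(\sigma_1^2 - \sigma_2^2)^2 + 4\sigma_1^2\sigma_2^2\,\text{cosec}^2\,2\alpha]^{1/2}}$$

[M. A. Delhi ’65, ’68]

Deduce that

$$|\,r_{uv}\,| \le \frac{|\,\sigma_1^2 - \sigma_2^2\,|}{\sigma_1^2 + \sigma_2^2}.$$

[Hint. $\bar x = \bar y = 0 \Rightarrow \bar u = \bar v = 0$

$\sigma_u^2 = E(u^2) = \cos^2\alpha\,\sigma_1^2 + \sin^2\alpha\,\sigma_2^2$,

since cov $(x, y) = E(xy) = 0$

$\sigma_v^2 = E(v^2) = \sin^2\alpha\,\sigma_1^2 + \cos^2\alpha\,\sigma_2^2$

cov $(u, v) = E(uv) = \sin\alpha\cos\alpha\,\sigma_1^2 - \sin\alpha\cos\alpha\,\sigma_2^2$

$$r_{uv} = \frac{\text{cov}(u, v)}{\sigma_u\sigma_v} = \frac{\sin\alpha\cos\alpha\,(\sigma_1^2 - \sigma_2^2)}{\sqrt{\sin^2\alpha\cos^2\alpha\,(\sigma_1^4 + \sigma_2^4) + \sigma_1^2\sigma_2^2\,(\cos^4\alpha + \sin^4\alpha)}}$$

$$= \frac{\sigma_1^2 - \sigma_2^2}{[\sigma_1^4 + \sigma_2^4 + \sigma_1^2\sigma_2^2\,(4\,\text{cosec}^2\,2\alpha - 2)]^{1/2}} = \frac{\sigma_1^2 - \sigma_2^2}{[(\sigma_1^2 - \sigma_2^2)^2 + 4\sigma_1^2\sigma_2^2\,\text{cosec}^2\,2\alpha]^{1/2}}$$

$$r_{uv}^2 = 1 - \frac{4\sigma_1^2\sigma_2^2}{(\sigma_1^2 - \sigma_2^2)^2\sin^2 2\alpha + 4\sigma_1^2\sigma_2^2}$$

which is numerically greatest when $\alpha = \pm\pi/4$. The extreme values
of r are

$$\pm\,\frac{\sigma_1^2 - \sigma_2^2}{\sigma_1^2 + \sigma_2^2}.]$$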

Ex. 17 (a) Let $x_1$, $x_2$ and $x_3$ be uncorrelated variables each
having the same standard deviation. Obtain the coefficient of
correlation between $(x_1 + x_2)$ and $(x_2 + x_3)$. (I. A. S. 1949)

[Ans. $r = \frac{1}{2}$]

(b) $x_1$, $x_2$, $x_3$ are three correlated variables, each with variance $\sigma^2$,
and the coefficient of correlation between any two of them is $\rho$. If

$$\bar x = \frac{x_1 + x_2 + x_3}{3}$$

prove that $V(\bar x) = (\sigma^2/3)\,(1 + 2\rho)$.

(c) If $U = ax$ and $V = by$, obtain the regression line of U
on V.

Ex. 18 (a) If $\sigma_x$ and $\sigma_y$ are the standard deviations of two
uncorrelated statistical variates x and y, prove that the standard
deviation of $ax + by$ is

$$\sqrt{a^2\sigma_x^2 + b^2\sigma_y^2}$$

(b) Show that if a and b are constants and r is the correlation
between x and y, then the correlation between ax and by is equal
to r if the signs of a and b are alike and to -r if they are different.
Also show that, if the constants a, b, c are positive, the correlation
between $ax + by$ and $cy$ is equal to

$$(ar\sigma_x + b\sigma_y)\,/\,\sqrt{a^2\sigma_x^2 + b^2\sigma_y^2 + 2abr\sigma_x\sigma_y}$$

Ex. 19. Two random variables have the least square regression
lines with equations $3x + 2y - 26 = 0$ and $6x + y - 31 = 0$. Find
the mean values and the correlation coefficient between x and y.

[M.A. Delhi 1960; Bombay ’68]

Ex. 20 (a) In a partially destroyed laboratory record of an
analysis of correlation data, the following results only are legible :

variance of x = 9

Regression equations : $8x - 10y + 66 = 0$, $40x - 18y = 214$

What were (a) the mean values of x and y, (b) the standard deviation
of y, and (c) the coefficient of correlation between x and y ?

[I.A.S. 1947]

[Hint. Since the regression lines pass through the means $\bar x$
and $\bar y$,

$8\bar x - 10\bar y + 66 = 0$, $40\bar x - 18\bar y = 214 \Rightarrow \bar x = 13$, $\bar y = 17$

The lines of regression can be put in the form
$y = 0.8x + 6.6$ and $x = 0.45y + 5.35$.

Hence the coefficients of regression of y on x and of x on y are

$$r\,\frac{\sigma_y}{\sigma_x} = 0.8, \quad r\,\frac{\sigma_x}{\sigma_y} = 0.45 \Rightarrow r^2 = 0.45 \times 0.8 \Rightarrow r = 0.6$$

$\sigma_x^2 = 9 \Rightarrow \sigma_x = 3$, hence $\sigma_y = \dfrac{0.8\,\sigma_x}{r} = 4$.]

(b) Two lines of regression are given by

$x + 2y - 5 = 0$ and $2x + 3y - 8 = 0$, and $\sigma_x^2 = 12$.

Calculate the mean values of x and y, the variance of y and the
coefficient of correlation between x and y. [Allahabad ’52]

[Ans. $r = -0.86$, $\sigma_y^2 = 4$]
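
The steps of the hint to Ex. 20 (intersection of the two lines for the means, product of the regression coefficients for $r^2$) can be sketched as below. The function name `from_regression_lines` is hypothetical, and the sketch assumes the caller already knows which line is the regression of y on x; lines are passed as coefficients (a, b, c) of $ax + by + c = 0$.

```python
from fractions import Fraction
from math import sqrt


def from_regression_lines(y_on_x, x_on_y, var_x):
    """Means, r and var(y) from two regression lines a*x + b*y + c = 0."""
    a1, b1, c1 = map(Fraction, y_on_x)
    a2, b2, c2 = map(Fraction, x_on_y)
    det = a1 * b2 - a2 * b1
    mean_x = (b1 * c2 - b2 * c1) / det   # Cramer's rule for the intersection
    mean_y = (a2 * c1 - a1 * c2) / det
    b_yx = -a1 / b1                      # slope dy/dx of the y-on-x line
    b_xy = -b2 / a2                      # slope dx/dy of the x-on-y line
    r = sqrt(float(b_yx * b_xy))
    if b_yx < 0:                         # r carries the sign of the slopes
        r = -r
    sd_y = float(b_yx) * sqrt(var_x) / r
    return {"mean_x": mean_x, "mean_y": mean_y, "r": r, "var_y": sd_y ** 2}


part_a = from_regression_lines((8, -10, 66), (40, -18, -214), var_x=9)
part_b = from_regression_lines((1, 2, -5), (2, 3, -8), var_x=12)
print(part_a)
print(part_b)
```

If the product of the two slopes exceeded 1 the lines would have been assigned the wrong way round, which is a quick consistency check on the assumed labelling.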

Ex. 21. By minimizing $\Sigma f_i\,(x_i\cos\alpha + y_i\sin\alpha - p)^2$ for variations
in $\alpha$ and p, show that there are two straight lines passing
through the mean of the distribution for which the sum of squares
of normal deviations has an extreme value. Prove also that their
slopes are given by

$$\tan 2\alpha = \frac{2 r \sigma_x \sigma_y}{\sigma_x^2 - \sigma_y^2} \qquad \text{[M. A. (Madras)]}$$

Ex. 22. Find the most likely price in Bombay (x) corresponding
to the price of Rs. 70 at Calcutta (y) from the following
data :

$\bar y = 65$, $\bar x = 67$, $\sigma_x = 3.5$, $\sigma_y = 2.5$, $r_{xy} = 0.8$.

[Agra 54, Punjab 61]

[Ans. Rs. 72.6]

Ex. 23. If*„ * 2 and * 3 are three variates measured from 
their respective means as origin and of equal variances, find the 
coefficient of correlation between *x+* 3 and * 2 -b* 3 in terms of 
f i 2 » ^3 and r 23 and show that it is eq ual to 

(i) if r„-r tt =0 or (ii) if r„ = f as =l. 

[(I. S. I. Cal. 56, Agra B. Sc. 68] 

[Hint. $E(x_1) = E(x_2) = E(x_3) = 0$

$V(x_1) = V(x_2) = V(x_3) = \sigma^2$, say

$E(x_1 + x_3) = E(x_2 + x_3) = 0$,

$V(x_1 + x_3) = V(x_1) + V(x_3) + 2\ \text{cov}(x_1, x_3)$

$= \sigma^2 + \sigma^2 + 2\sigma^2 r_{13}$

$= 2\sigma^2\,(1 + r_{13})$

$$r_{x_1 + x_3,\ x_2 + x_3} = \frac{E\,(x_1 + x_3)(x_2 + x_3)}{\sqrt{V(x_1 + x_3)\,V(x_2 + x_3)}}$$

$$= \frac{E\,(x_1x_2 + x_1x_3 + x_2x_3 + x_3^2)}{2\sigma^2\sqrt{(1 + r_{13})(1 + r_{23})}}$$

$$= \frac{\sigma^2 r_{12} + \sigma^2 r_{13} + \sigma^2 r_{23} + \sigma^2}{2\sigma^2\sqrt{(1 + r_{13})(1 + r_{23})}}$$

$$= \frac{r_{12} + r_{13} + r_{23} + 1}{2\sqrt{(1 + r_{13})(1 + r_{23})}}\ ]$$

9.5. Correlation of Ranks. Suppose that a group of n students
is examined in Physics and Mathematics and the position (or
rank) of the ith student is represented by the letters $x_i$, $y_i$ in the
two subjects respectively. Then the coefficient of correlation
between the x's and the y's is called the rank correlation coefficient.
Let us assume that no two students get the same position in either
subject, and hence each of the variables takes the values 1, 2,
3, ..., n. Therefore

$$\bar x = \bar y = \frac{n + 1}{2} \qquad ...(1)$$

since $x_i$ = 1 or 2 or 3 ... or n.

As a rule, $x_i$ is not equal to $y_i$. Let $d_i$ denote the difference,
so that

$d_i = x_i - y_i$

Since $0 = \bar x - \bar y$,

we have $d_i = (x_i - \bar x) - (y_i - \bar y) = x_i' - y_i'$ ...(2)

where x' and y' denote the deviations of the variables from their
means.

Therefore $\sum_{1}^{n} d_i^2 = \Sigma x_i'^2 + \Sigma y_i'^2 - 2\,\Sigma x_i'y_i'$

or

$$2\,\Sigma x_i'y_i' = 2\,(1^2 + 2^2 + \dots + n^2) - 2n\bar x^2 - \Sigma d_i^2.$$

Since

$$\frac{\Sigma x_i'^2}{n} = \frac{1}{n}\,\Sigma x_i^2 - \bar x^2 = \frac{(n + 1)(2n + 1)}{6} - \frac{(n + 1)^2}{4} = \frac{n^2 - 1}{12},$$

we have

$$\frac{1}{n}\,\Sigma x_i'y_i' = \frac{n^2 - 1}{12} - \frac{\Sigma d_i^2}{2n}.$$

The coefficient of correlation between the variables is given by

$$r = \frac{E\,(x - \bar x)(y - \bar y)}{\sqrt{E\,(x - \bar x)^2\ E\,(y - \bar y)^2}} = \frac{\dfrac{n^2 - 1}{12} - \dfrac{\Sigma d_i^2}{2n}}{\dfrac{n^2 - 1}{12}} = 1 - \frac{6\,\Sigma d_i^2}{n^3 - n}.$$

This formula for the rank correlation coefficient is due to Spearman.

Note. In case two or more students have the same rank in
either subject, the factor $\frac{1}{12}\,m\,(m^2 - 1)$, known as the correction factor,
is added to the sum $\Sigma d_i^2$, where m is the number of times a
value is repeated.

If the correlation is perfect then $d_i = 0$ for every i, and $r = 1$.
If the ranks in the two subjects are exactly the reverse of each
other, $x_i + y_i = n + 1$. Hence

$x_i - y_i = d_i$ and $x_i + y_i = n + 1$ yield

$d_i = 2x_i - (n + 1)$

and therefore

$$\Sigma d_i^2 = 4\,\Sigma x_i^2 - 4\,(n + 1)\,\Sigma x_i + n\,(n + 1)^2$$

$$= \tfrac{2}{3}\,n\,(n + 1)(2n + 1) - 2n\,(n + 1)^2 + n\,(n + 1)^2 = \frac{n\,(n^2 - 1)}{3}.$$

Thus in this case $r = -1$ and there is perfect inverse correlation.


• * 


Ex. Show that if the ranks for the two series agree completely,
r = 1, and if they are exactly the reverse of each other, r = -1.
Show also that the maximum value of $\Sigma d^2$ is $\frac{1}{3}\,(n^3 - n)$ and hence that

$-1 \le r \le 1$.


Exercises 

1. The rankings of ten students in two subjects A and B are
as follows :

A : 3 5 8 4 7 10 2 1 6 9

B : 6 4 9 8 1 2 3 10 5 7

What is the coefficient of rank correlation ? [Delhi ’51]

[Hint. $d_i$ : -3, 1, -1, -4, 6, 8, -1, -9, 1, 2,

$\Sigma d_i^2 = 214$, $r = 1 - \dfrac{6 \times 214}{10 \times 99} = -0.297$.]

2. Calculate correlation coefficient r and rank coin lata n 
coefficient r' for the distribution of sales fvl and evpc.tses S«*» 

as under: . „ . 

r • 50 50 55 60 65 65 65 Ml 00 .0 

*11 1? 14 16 in 15 15 .4 15 15 

(Ads. r-O-787, r'=0-751 | Bombay 4.S| 

3 Calculate the rank correlation coctbcient. gi'en that the 

ranks of the sam 16 students in Mathematics and I’hyates are as 
Wlows TWO members within brae ets denote the ranks o, the 

* sa, ( ,, v ...»<«. ** <5sa « aft? 

4 Calculate the rank coticlaiion coelllcient lor the marks- 

distribution of 10 students : l)0 6? (l5 39 

Marks in Mathematics . 8 56 >8 75 /. • 

Marks in Statistics : 84 5 . 9. 60 68 62 36 b^.7 

[Ans. *05 app ox ] 




5. Ten competitors in a musical test are ranked by three
judges (X, Y, Z) in the following order :

X : 1 6 5 10 3 2 4 9 7 8

Y : 3 5 8 4 7 10 2 1 6 9

Z : 6 4 9 8 1 2 3 10 5 7

Comment on the pair having the nearest approach to common
likings in Music. [Allahabad ’52; Delhi ’57]

[$r_{XY} = -7/33$, $r_{YZ} = -49/165$, $r_{XZ} = 7/11$. The pair of judges
X and Z has the nearest approach to common likings in Music.]
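
Spearman's formula $r = 1 - 6\,\Sigma d^2/(n^3 - n)$ for untied ranks is easy to evaluate exactly. The sketch below applies it to the three judges of Exercise 5; the function name `spearman` is hypothetical, and no correction for tied ranks is attempted.

```python
from fractions import Fraction


def spearman(rank_x, rank_y):
    """Rank correlation 1 - 6*sum(d^2)/(n^3 - n); assumes no tied ranks."""
    n = len(rank_x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - Fraction(6 * d2, n ** 3 - n)


X = [1, 6, 5, 10, 3, 2, 4, 9, 7, 8]
Y = [3, 5, 8, 4, 7, 10, 2, 1, 6, 9]
Z = [6, 4, 9, 8, 1, 2, 3, 10, 5, 7]

r_xy, r_yz, r_xz = spearman(X, Y), spearman(Y, Z), spearman(X, Z)
print(r_xy, r_yz, r_xz)

# Exactly reversed ranks give perfect inverse correlation.
r_reversed = spearman(list(range(1, 11)), list(range(10, 0, -1)))
print(r_reversed)
```

Using `Fraction` keeps the results in the same form as the printed answers, so they can be compared directly.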

9.6. Correlation Ratio. The correlation ratio of y on x is denoted
by $\eta_{yx}$ and, in analogy to the correlation coefficient, for which
$r^2 = 1 - S_y^2/\sigma_y^2$, it is defined by

$$\eta_{yx}^2 = 1 - S_y'^2/\sigma_y^2 \qquad ...(1)$$

where

$$S_y'^2 = \frac{1}{N}\sum_i \sum_j f_{ij}\,(y_{ij} - \bar y_i)^2$$

and $\eta_{yx}$ is positive. From (1), we find

$$S_y'^2 = \sigma_y^2\,(1 - \eta_{yx}^2).$$

This implies $\eta_{yx}^2 \le 1$.

Note that $N S_y'^2$ is the sum of the squares of the deviations of
the points from the mean in each array, and it must be least.
Hence

$$S_y'^2 \le S_y^2$$

where $N S_y^2$ is the sum of the squares of the deviations of points
from the line of regression of y on x.

Hence $\sigma_y^2\,(1 - \eta_{yx}^2) \le \sigma_y^2\,(1 - r^2)$

or $\eta_{yx}^2 \ge r^2 \Rightarrow 0 \le r^2 \le \eta_{yx}^2 \le 1$ ...(3)

In case the regression is linear, $\eta_{yx}^2 = r^2$, otherwise
$\eta_{yx}^2 > r^2$.

Theorem. Show that $\eta_{yx} = \dfrac{\sigma_{m_y}}{\sigma_y}$, $\eta_{xy} = \dfrac{\sigma_{m_x}}{\sigma_x}$,

where $\sigma_{m_y}$ and $\sigma_{m_x}$ are the S.D.'s of the means of vertical and horizontal
arrays respectively, each mean being weighted by the
corresponding frequency of the array in which it lies.

Proof. $\sigma_{m_y}^2 = \dfrac{1}{N}\sum_i n_i\,(\bar y_i - \bar y)^2$ by definition ...(1)

$$\sigma_y^2 = \frac{1}{N}\sum_i \sum_j f_{ij}\,(y_{ij} - \bar y)^2 = \frac{1}{N}\sum_i \sum_j f_{ij}\,(y_{ij} - \bar y_i)^2 + \frac{1}{N}\sum_i n_i\,(\bar y_i - \bar y)^2$$

since

$$\sum_j f_{ij}\,(y_{ij} - \bar y_i) = 0, \qquad n_i = \sum_j f_{ij}.$$

Thus $\sigma_y^2 = S_y'^2 + \sigma_{m_y}^2$ ...(2)

that is, the variance of y in the distribution is equal to the sum of
the variance within the arrays and the variance of the weighted
means of the arrays.

As $S_y'^2 = \sigma_y^2\,(1 - \eta_{yx}^2)$,

$\sigma_y^2 - \sigma_{m_y}^2 = \sigma_y^2\,(1 - \eta_{yx}^2)$

or

$$\eta_{yx}^2 = \frac{\sigma_{m_y}^2}{\sigma_y^2}, \qquad \eta_{yx} = \frac{\sigma_{m_y}}{\sigma_y}.$$

Thus the correlation ratio $\eta_{yx}$ is the ratio of the S.D. of the
weighted means of the arrays of y's to the S.D. of all the y's of
the distribution.

By similar reasoning we shall have

$$\eta_{xy} = \frac{\sigma_{m_x}}{\sigma_x}.$$
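
The theorem can be illustrated on a small made-up table. In the sketch below the data in `arrays` are purely hypothetical (three vertical arrays of two points each), and the names `correlation_ratio_sq` and `r_squared` are illustrative; the aim is only to show $\eta_{yx}^2 = \sigma_{m_y}^2/\sigma_y^2 = 1 - S_y'^2/\sigma_y^2$ and $\eta_{yx}^2 \ge r^2$ on a non-linear example.

```python
from fractions import Fraction

# Hypothetical data: for each x value, the y values in that vertical array.
arrays = {1: [1, 3], 2: [4, 6], 3: [2, 4]}


def correlation_ratio_sq(arrays):
    """eta^2 of y on x: weighted variance of array means over total variance."""
    ys = [y for col in arrays.values() for y in col]
    big_n = len(ys)
    mean_y = Fraction(sum(ys), big_n)
    between = sum(len(col) * (Fraction(sum(col), len(col)) - mean_y) ** 2
                  for col in arrays.values())
    total = sum((y - mean_y) ** 2 for y in ys)
    return between / total


def r_squared(arrays):
    """Square of the product-moment correlation for the same points."""
    pts = [(x, y) for x, col in arrays.items() for y in col]
    n = len(pts)
    mx = Fraction(sum(x for x, _ in pts), n)
    my = Fraction(sum(y for _, y in pts), n)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    syy = sum((y - my) ** 2 for _, y in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy ** 2 / (sxx * syy)


eta2 = correlation_ratio_sq(arrays)
r2 = r_squared(arrays)
print(eta2, r2)
```

Here the array means rise and then fall, so the regression is far from linear and $\eta^2$ is much larger than $r^2$.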


Ex 1. Prove from first principles : 

(i) (B. A. Hons., Delhi 61) 

(ii) ^ = —■ 

G z 

Ex. 2. Prove the relations : 

(i) Z n ( ( y t - Y f )*=No u z CnV-r 2 ) 
where Yt is the estimate of y t from the regression equation 

y-y=^i (*-*) 

(ii) 2 tit (p/-y) 2 =2 n t CP/—T,) 2 + 2 m (T,->) 2 

i i * 

(iii) t)Wv 2 =*i/ 2 (>) 2 V .-r 2 ) + r 2 <ly 2 . 

Ex. 3. If Yt is the estimate of yt obtained from the regre¬ 
ssion equation y=a-\-bx, then prove that 

(0 a y = I r I "v 

<“> IBf- 1 r 1 

[Hint. We have Yt = a+bx t . The normal eqns are 


2 ft 0'i — Yt}=0 => y = Y 





Thus 

or 


Zfxi (y,-Yi )-0 
'Zfi ( Yi—y) 2 =Nr~o v z as above 
c y 2 =r t a v * => a u — | r | c tf 


Let y= Y=0. Then 2/,7i (yi— Ti)=0 yields 
Z fiyiYi~T.fi Yf=Nc y 2 

Thus 

NOyGy CyO y a y 


9.7. Intraclass Correlation. We may be interested in finding
the correlation between the measures of some common characteristic
for pairs of members of the same family or class. That
common characteristic may be the weights of brothers, or the heights
of sisters. The relation between two members of the same family
is a reciprocal one, for if we suppose that P and Q are members
of the same family and if x measures the characteristic for P and
y measures the characteristic for Q, x and y can be mutually
interchanged for P and Q. Thus each pair of members, P and Q,
will contribute two entries to the correlation table and, as x and
y can be interchanged, the table will be symmetrical.

Table of measurements

Member of                     Family
the family     1      2      3     ...     i     ...     h     Total
    1         x11    x21    x31    ...    xi1    ...    xh1      h
    2         x12    x22    x32    ...    xi2    ...    xh2      h
    3         x13    x23    x33    ...    xi3    ...    xh3      h
   ...        ...    ...    ...    ...    ...    ...    ...     ...
    j         x1j    x2j    x3j    ...    xij    ...    xhj      h
   ...        ...    ...    ...    ...    ...    ...    ...     ...
    k         x1k    x2k    x3k    ...    xik    ...    xhk      h
  Total        k      k      k     ...     k     ...     k       hk

Suppose there are h families with k members in each. Then
there are $k\,(k - 1)$ pairs of values for each family, and the total
number of pairs of values in the table is

$N = hk\,(k - 1)$.

$x_{ij}$ measures the characteristic of the jth member in the ith
family. Consequently $i = 1, 2, ..., h$ and $j = 1, 2, ..., k$. In the ith
family, if $x_{ij}$ is the x then y is $x_{i1}$ or $x_{i2}$ or $x_{i3}$ ... or $x_{ik}$ (this
excludes $x_{ij}$). Thus each value $x_{ij}$ occurs $k - 1$ times as an x, and
the mean $\bar x$ for the bivariate distribution is

$$\bar x = \frac{(k - 1)\sum_i \sum_j x_{ij}}{hk\,(k - 1)} = \frac{1}{hk}\sum_i \sum_j x_{ij}.$$

Also $\bar y = \bar x$, since the table is symmetrical.
The variance $\sigma_x^2$ of the x's is

$$\sigma_x^2 = \frac{(k - 1)\sum_i \sum_j (x_{ij} - \bar x)^2}{hk\,(k - 1)} = \frac{1}{hk}\sum_i \sum_j (x_{ij} - \bar x)^2.$$

Also $\sigma_y^2 = \sigma_x^2$ in virtue of symmetry.

Let $\sigma_x^2 = \sigma_y^2 = \sigma^2$ ...(3)

The coefficient of intraclass correlation is given by

$$r = \frac{\sum_i \sum_j \sum_l (x_{ij} - \bar x)(x_{il} - \bar x)}{hk\,(k - 1)\,\sigma_x \sigma_y} \qquad ...(4)$$

$(j, l = 1, 2, ..., k;\ j \ne l;\ i = 1, 2, ..., h)$

Now, summing over l,

$$\sum_i \sum_j \sum_l (x_{ij} - \bar x)(x_{il} - \bar x) = \sum_i \sum_j (x_{ij} - \bar x)\,\{k\bar x_i - x_{ij} - (k - 1)\,\bar x\}$$

since $\sum_l x_{il}$ = the sum of all values for the ith family except
$x_{ij}$ = $k\bar x_i - x_{ij}$, where $\bar x_i$ is the mean for the ith family.

Thus

$$\sum_i \sum_j \sum_l (x_{ij} - \bar x)(x_{il} - \bar x) = k \sum_i \sum_j (x_{ij} - \bar x)(\bar x_i - \bar x) - \sum_i \sum_j (x_{ij} - \bar x)^2.$$

Now

$$k \sum_i \sum_j (x_{ij} - \bar x)(\bar x_i - \bar x) = k \sum_i (\bar x_i - \bar x)(k\bar x_i - k\bar x) = k^2 \sum_i (\bar x_i - \bar x)^2 = k^2 h\,\sigma_m^2$$

where $\sigma_m^2$ is the variance of the means of the families. [Note.
$\sum_j (x_{ij} - \bar x) = k\bar x_i - k\bar x$ since $j = 1, 2, ..., k$.]

Thus (4) can be written as

$$r = \frac{k^2 h\,\sigma_m^2 - hk\,\sigma^2}{hk\,(k - 1)\,\sigma^2} = \frac{k\sigma_m^2 - \sigma^2}{(k - 1)\,\sigma^2} = \frac{1}{k - 1}\left[\frac{k\sigma_m^2}{\sigma^2} - 1\right].$$

Limits for intraclass correlation coefficient

$$1 + (k - 1)\,r = \frac{k\sigma_m^2}{\sigma^2} \ge 0 \qquad ...(5)$$

$$\therefore\ r \ge -\frac{1}{k - 1}.$$

Also, since $\dfrac{\sigma_m^2}{\sigma^2} \le 1$,

$1 + (k - 1)\,r \le k$, hence $r \le 1$,

$$\therefore\ -\frac{1}{k - 1} \le r \le 1.$$

Note. r is obviously independent of the origin and the scale.

Ex. In five families of 3, the heights of brothers are in inches :
69, 70, 71; 70, 71, 72; 71, 72, 73; 72, 73, 74; 73, 74, 75.

Find the intraclass coefficient of correlation.

[M. A. (Stat) Delhi ’66]
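
The closing example can be worked with the formula $r = (k\sigma_m^2 - \sigma^2)/\{(k - 1)\,\sigma^2\}$ and cross-checked against the defining sum over ordered pairs. This is an illustrative sketch, not the book's solution; the function names are hypothetical and every family is assumed to have the same number k of members.

```python
from fractions import Fraction

families = [[69, 70, 71], [70, 71, 72], [71, 72, 73], [72, 73, 74], [73, 74, 75]]


def intraclass_r(families):
    """r = (k*var_of_family_means - var) / ((k - 1) * var), equal family size k."""
    h, k = len(families), len(families[0])
    values = [x for fam in families for x in fam]
    mean = Fraction(sum(values), h * k)
    var = sum((x - mean) ** 2 for x in values) / (h * k)
    var_means = sum((Fraction(sum(f), k) - mean) ** 2 for f in families) / h
    return (k * var_means - var) / ((k - 1) * var)


def intraclass_r_by_pairs(families):
    """Same coefficient from the definition: all ordered pairs within a family."""
    h, k = len(families), len(families[0])
    values = [x for fam in families for x in fam]
    mean = Fraction(sum(values), h * k)
    var = sum((x - mean) ** 2 for x in values) / (h * k)
    cross = sum((fam[j] - mean) * (fam[l] - mean)
                for fam in families
                for j in range(k) for l in range(k) if j != l)
    return cross / (h * k * (k - 1) * var)


r_formula = intraclass_r(families)
r_pairs = intraclass_r_by_pairs(families)
print(r_formula, r_pairs)
```

Both routes give the same fraction, which also lies inside the limits $-1/(k - 1) \le r \le 1$ derived above.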
9.8. Bivariate Normal Distribution. The bivariate normal
distribution is a generalization of the normal distribution for a
single variate. It is derived as follows.

Assume first that the variable x is normally distributed with
S.D. $\sigma_1$ and mean zero. The probability that the random variable
x will fall in the interval dx is

$$dP_1 = \frac{1}{\sigma_1\sqrt{2\pi}}\,e^{-x^2/2\sigma_1^2}\,dx \qquad ...(1)$$

Assume next that the regression of y on x is linear and
homoscedastic. Then, if we assume that the S.D. of y in the
distribution is $\sigma_2$, the common variance of the arrays of y's is $S_y^2$,
which is equal to $\sigma_2^2\,(1 - \rho^2)$, where $\rho$ is the coefficient of
correlation between x and y. Finally we assume that each array of
y's is normally distributed. Then, since the regression of y on x
is linear, the mean of each array is on the line of regression

$$y - \bar y = \rho\,\frac{\sigma_2}{\sigma_1}\,(x - \bar x) \quad \text{or} \quad y = \rho\,\frac{\sigma_2}{\sigma_1}\,x \ \text{(in this case)}$$

and the variance of each array is $\sigma_2^2\,(1 - \rho^2)$; the probability that
the random variable y in an assigned vertical array will fall in the
interval dy is

$$dP_2 = \frac{1}{\sigma_2\sqrt{2\pi\,(1 - \rho^2)}}\,\exp\left[-\frac{1}{2\sigma_2^2\,(1 - \rho^2)}\left(y - \rho\,\frac{\sigma_2}{\sigma_1}\,x\right)^2\right] dy \qquad ...(2)$$

Hence the probability of a pair of values (x, y) falling in
the elementary rectangle dx dy is

$$dP = dP_1 \cdot dP_2 = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}}\,\exp\left[-\frac{x^2}{2\sigma_1^2} - \frac{1}{2\sigma_2^2\,(1 - \rho^2)}\left(y - \rho\,\frac{\sigma_2}{\sigma_1}\,x\right)^2\right] dx\,dy$$

$$= \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}}\,\exp\left[-\frac{1}{2\,(1 - \rho^2)}\left(\frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)\right] dx\,dy$$

The probability density $\phi\,(x, y)$ for the distribution is

$$\phi\,(x, y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}}\,\exp\left[-\frac{1}{2\,(1 - \rho^2)}\left(\frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)\right],$$

$-\infty < x, y < +\infty$ ...(3)

Such a distribution is called a bivariate normal distribution,
and the variables are said to be normally distributed.

The surface $z = \phi\,(x, y)$ is called the normal correlation surface.
Since the relation (3) is symmetrical in x and y, we conclude that
the regression of x on y is also linear and the variance of each
array of x's is $\sigma_1^2\,(1 - \rho^2)$. The values of x in each horizontal
array are normally distributed, with mean on the line of regression
of x on y, that is,

$$x = \rho\,\frac{\sigma_1}{\sigma_2}\,y$$

The curves along which the probability density is constant
are given by the equation

$$\frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2} = \lambda^2 \qquad ...(4)$$

This equation represents an ellipse with centre at (0, 0). We
write (4) as

$$\frac{x^2}{(\lambda\sigma_1)^2} - \frac{2\rho xy}{(\lambda\sigma_1)(\lambda\sigma_2)} + \frac{y^2}{(\lambda\sigma_2)^2} = 1$$

Exercises 

1. Write down the probability density function of the five-
parametric bivariate normal distribution in the variates x and y,
and show that

(i) x is distributed normally with variance $\sigma_x^2$

(ii) y is distributed normally with variance $\sigma_y^2$

(iii) For any given x, y is distributed normally about mean
$(\rho\sigma_y/\sigma_x)\,x$ with variance $\sigma_y^2\,(1 - \rho^2)$. (M.A. Delhi 60, 65)

(iv) For any given y, x is distributed normally about mean
$(\rho\sigma_x/\sigma_y)\,y$ with variance $\sigma_x^2\,(1 - \rho^2)$.

(v) Both the regressions of y on x and x on y are linear.

(M.A. Punjab ’60)

(vi) The variances of all the y arrays are the same (i.e. the regression
of y on x is homoscedastic).

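2. Find the probability of simultaneous materialization of
the inequalities $x > E(x)$ and $y > E(y)$.

(M. Sc. Bombay ’69)

[Hint. Probability

$$= \int_0^\infty \int_0^\infty \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}}\,\exp\left[-\frac{1}{2\,(1 - \rho^2)}\left(\frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)\right] dx\,dy$$

Put

$$\frac{x}{\sigma_1\sqrt{2\,(1 - \rho^2)}} = u, \qquad \frac{y}{\sigma_2\sqrt{2\,(1 - \rho^2)}} = v$$

$\Rightarrow dx = \sigma_1\sqrt{2\,(1 - \rho^2)}\ du$, $dy = \sigma_2\sqrt{2\,(1 - \rho^2)}\ dv$

Hence

$$\text{integral} = \frac{\sqrt{1 - \rho^2}}{\pi}\int_0^\infty \int_0^\infty e^{-(u^2 - 2\rho uv + v^2)}\,du\,dv$$

Put $u = r\cos\theta$, $v = r\sin\theta \Rightarrow du\,dv = r\,d\theta\,dr$

$$\text{Integral} = \frac{\sqrt{1 - \rho^2}}{\pi}\int_0^{\pi/2}\int_0^\infty e^{-(r^2 - 2\rho r^2\sin\theta\cos\theta)}\,r\,dr\,d\theta$$

$$= \frac{\sqrt{1 - \rho^2}}{\pi}\int_0^{\pi/2} \frac{d\theta}{2\,(1 - \rho\sin 2\theta)}$$

$$= \frac{\sqrt{1 - \rho^2}}{2\pi}\int_0^{\pi/2} \frac{d\theta}{\sin^2\theta + \cos^2\theta - 2\rho\sin\theta\cos\theta}$$

$$= \frac{\sqrt{1 - \rho^2}}{2\pi}\int_0^{\pi/2} \frac{\sec^2\theta\,d\theta}{1 + \tan^2\theta - 2\rho\tan\theta}$$

$$= \frac{\sqrt{1 - \rho^2}}{2\pi}\int_0^\infty \frac{dz}{1 + z^2 - 2\rho z}, \qquad z = \tan\theta$$

$$= \frac{1}{2\pi}\left[\tan^{-1}\frac{z - \rho}{\sqrt{1 - \rho^2}}\right]_0^\infty$$

$$= \frac{1}{2\pi}\left[\frac{\pi}{2} - \tan^{-1}\frac{-\rho}{\sqrt{1 - \rho^2}}\right]$$

$$= \frac{1}{2\pi}\left[\frac{\pi}{2} + \tan^{-1}\frac{\rho}{\sqrt{1 - \rho^2}}\right].]$$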
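
The final expression of the hint equals $\frac{1}{4} + \frac{1}{2\pi}\sin^{-1}\rho$, because $\tan^{-1}\{\rho/\sqrt{1 - \rho^2}\} = \sin^{-1}\rho$. The sketch below compares that closed form with a direct midpoint-rule evaluation of the single integral over $\theta$ reached in the hint; the function names and the step count are arbitrary illustrative choices.

```python
from math import asin, pi, sin, sqrt


def quadrant_probability(rho, steps=20000):
    """sqrt(1 - rho^2)/(2*pi) * integral_0^{pi/2} dtheta / (1 - rho*sin(2*theta)),
    evaluated with the midpoint rule."""
    h = (pi / 2) / steps
    total = sum(1.0 / (1.0 - rho * sin(2 * (i + 0.5) * h)) for i in range(steps))
    return sqrt(1 - rho * rho) / (2 * pi) * total * h


def closed_form(rho):
    """(1/(2*pi)) * (pi/2 + arcsin(rho)), the value reached in the hint."""
    return 0.25 + asin(rho) / (2 * pi)


for rho in (-0.3, 0.0, 0.5):
    print(rho, quadrant_probability(rho), closed_form(rho))
```

For $\rho = 0$ the probability is 1/4, as it must be for independent variates, and it rises towards 1/2 as $\rho$ approaches 1.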

3. Show that the curves along which the density function is
constant for a bivariate normal distribution are the ellipses

$$\frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2} = \lambda^2.$$

For these ellipses show that the line of regression of y on x
is conjugate to the axis of y and the line of regression of x on y is
conjugate to the axis of x. [B.Sc. (cal) ’68]

4. Obtain the regression equation of y on .x if their joint 
distribution has the frequency function 

/ (•*, , 0< x, y < 00 

(B. Sc. Nagpur ’65) 

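5. Show that if $x_1$ and $x_2$ are standard normal variates with
correlation coefficient $\rho$, then the correlation between $x_1^2$ and
$x_2^2$ is $\rho^2$. (M.Sc. Patna ’69)

[Hint. m.g.f. $= e^{\frac{1}{2}(t_1^2 + 2\rho t_1t_2 + t_2^2)}$

$$= 1 + \tfrac{1}{2}\,(t_1^2 + 2\rho t_1t_2 + t_2^2) + \frac{1}{2!}\left\{\tfrac{1}{2}\,(t_1^2 + 2\rho t_1t_2 + t_2^2)\right\}^2 + \dots$$

$$E\,(x_1^2 x_2^2) = \text{coefficient of } \frac{t_1^2\,t_2^2}{2!\ 2!} = 1 + 2\rho^2$$

$$\text{Required correlation} = \frac{E\,(x_1^2x_2^2) - E\,(x_1^2)\,E\,(x_2^2)}{\sqrt{\{E\,(x_1^4) - E^2\,(x_1^2)\}\{E\,(x_2^4) - E^2\,(x_2^2)\}}}$$

$$= \frac{1 + 2\rho^2 - 1}{\sqrt{(3 - 1)(3 - 1)}} = \rho^2.]$$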

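6. If x and y are normally correlated variables with mean
zero and variances $\sigma_1^2$ and $\sigma_2^2$ respectively, show that

$$w = \frac{x}{\sigma_1}, \qquad z = \frac{1}{\sqrt{1 - \rho^2}}\left(\frac{y}{\sigma_2} - \frac{\rho x}{\sigma_1}\right)$$

are independent normal variates with mean zero and variance
unity. [B. Sc. (cal) ’66]

[Hint.

$$\frac{\partial\,(w, z)}{\partial\,(x, y)} = \frac{1}{\sigma_1\sigma_2\sqrt{1 - \rho^2}}$$

and

$$w^2 + z^2 = \frac{1}{1 - \rho^2}\left(\frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)$$

Hence

$$dP = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}}\,\exp\left[-\frac{1}{2\,(1 - \rho^2)}\left(\frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)\right] dx\,dy$$

Since

$$dx\,dy = \frac{\partial\,(x, y)}{\partial\,(w, z)}\,dw\,dz,$$

$$dP = \frac{1}{2\pi}\,e^{-\frac{1}{2}(w^2 + z^2)}\,dw\,dz = \left[\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}w^2}\,dw\right]\left[\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}z^2}\,dz\right].]$$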

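7. The variables x and y with zero means and standard
deviations $\sigma_1$ and $\sigma_2$ are normally correlated with correlation
coefficient $\rho$. Show that u and v defined as

$$u = \frac{x}{\sigma_1} + \frac{y}{\sigma_2}$$

and

$$v = \frac{x}{\sigma_1} - \frac{y}{\sigma_2}$$

are independent normal variates with variances $2\,(1 + \rho)$ and
$2\,(1 - \rho)$ respectively.

[M.A. (stat) Delhi ’67]

[Hint. As in Exercise 6,

$$dx\,dy = \frac{du\,dv}{\left|\dfrac{\partial\,(u, v)}{\partial\,(x, y)}\right|} = \frac{\sigma_1\sigma_2}{2}\,du\,dv$$

Also

$$\frac{u^2 + v^2}{2} = \frac{x^2}{\sigma_1^2} + \frac{y^2}{\sigma_2^2}, \qquad \frac{u^2 - v^2}{2} = \frac{2xy}{\sigma_1\sigma_2}$$

Hence

$$dP = \frac{1}{4\pi\sqrt{1 - \rho^2}}\,\exp\left[-\frac{1}{2\,(1 - \rho^2)}\left\{\frac{u^2 + v^2}{2} - \rho\,\frac{u^2 - v^2}{2}\right\}\right] du\,dv$$

$$= \left[\frac{1}{\sqrt{2\pi}\sqrt{2\,(1 + \rho)}}\,e^{-\frac{u^2}{4\,(1 + \rho)}}\,du\right]\left[\frac{1}{\sqrt{2\pi}\sqrt{2\,(1 - \rho)}}\,e^{-\frac{v^2}{4\,(1 - \rho)}}\,dv\right].]$$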
8. If x and y are standard normal variates with coefficient of
correlation $\rho$, show that

(i) regression of y on x is linear

(ii) $x + y$ and $x - y$ are independently distributed

(iii) $\phi = \dfrac{x^2 - 2\rho xy + y^2}{1 - \rho^2}$ is distributed like a chi-square, i.e. as
that of the sum of squares of standard normal variates.

[M.A. (stat), Delhi ’59]
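9. A bivariate distribution in two discrete variables, x and y,
is defined by the probability generating function

$$\exp\,[a\,(u - 1) + b\,(v - 1) + c\,(u - 1)(v - 1)],$$

the simultaneous probability of $x = r$, $y = s$, where r and s are
integers, being the coefficient of $u^r v^s$. Find the correlation
coefficient of x and y. [M. Sc. Poona ’64]

[Hint. Put $u = e^{t_1}$, $v = e^{t_2}$,

$$M\,(t_1, t_2) = e^{\,a\,(e^{t_1} - 1) + b\,(e^{t_2} - 1) + c\,(e^{t_1} - 1)(e^{t_2} - 1)}$$

$$E\,(x) = \left[\frac{\partial M}{\partial t_1}\right]_{t_1 = t_2 = 0} = a, \qquad E\,(x^2) = \left[\frac{\partial^2 M}{\partial t_1^2}\right]_{t_1 = t_2 = 0} = a\,(a + 1)$$

$$E\,(y) = \left[\frac{\partial M}{\partial t_2}\right]_{t_1 = t_2 = 0} = b, \qquad E\,(y^2) = \left[\frac{\partial^2 M}{\partial t_2^2}\right]_{t_1 = t_2 = 0} = b\,(b + 1)$$

$$E\,(xy) = \left[\frac{\partial^2 M}{\partial t_1\,\partial t_2}\right]_{t_1 = t_2 = 0} = ab + c$$

Hence

$$r = \frac{E\,(xy) - E\,(x)\,E\,(y)}{\sqrt{\{E\,(x^2) - E^2\,(x)\}\{E\,(y^2) - E^2\,(y)\}}} = \frac{ab + c - ab}{\sqrt{ab}} = \frac{c}{\sqrt{ab}}.]$$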

10. From a standard bivariate normal population a sample
of n values $(x_1, y_1)$, $(x_2, y_2)$, ..., $(x_n, y_n)$ is drawn. Show that the
distribution of

$$z_1 = \frac{1}{n}\,\Sigma x^2 \quad \text{and} \quad z_2 = \frac{1}{n}\,\Sigma y^2$$

has the m.g.f.

$$\text{const.}\left\{\left(1 - \frac{2t_1}{n}\right)\left(1 - \frac{2t_2}{n}\right) - \frac{4\rho^2 t_1 t_2}{n^2}\right\}^{-n/2}$$

[M.A. Patna ’65; M.A. Delhi ’64]
11. Obtain the regression curve for the mean cf y for a given 
x for the distribution 

rt x _9 0 ^ x, y < oo. 

J(x>y) 2 («+x) 4 (i+y)* 

Also show that the marginal distribution of x is 
dP= l ■ 4^. dx, 0 < * < co. 


[ 


(1+x) 


Hint. h(y 


[M A. Delhi 62] 





12 


8 


f oo 

. 


9 (• 4-.X+.V) 


' \ \ ~ \ rr _ f 

2 (1+*)*(!+>•)< ^ 


' fo (H-*) 4 'f(l+w 3+ (l + v)‘]‘* : 

9 r i x _\°° 

5 J. 


2 ( 1 +*)* [ 
--2-. f± 

2 (I — *>*| 2 


* 0-' 

' HT'-cSS^f * 


2(1 +y)* 3(1 +y) J0 

x 13 2*4-3 

3 J 4 *(!+*,« 


+ 


1 


(2.x + 3) 
6 


n 


[ •* E ( Y i *)=J >' h (>’ i *) Jy] 


+ 


XV 


(I+jO 8 (l +y) 

l 




4? 


(2a* + 3) f. L(ll+jO* <!+>•)» ) + * (( 14-jO 3 


6 


t.N 


Ht 

+J'" T » +>’7* 2(1 4-j) 2 ^3(14-^)3 ] 0 


+ 


1 

‘ (1 +)’)' 

4- - V 
s4- 


rK L +f-fl 


(2*4- 

= 4*4-3 1 
2-V4-3J 

Show that for the bivariate normal distribution 
dP= cons, exp ^ (*>-2p xy+y') J dx dy 

m. g. f., 

M ('» '.)=«P [J (/ 1 2 +2pV a + / 1 s)] 

moments obey the recurrence relation 

M „ = (r +J _l) ? „ r _„ j-i+(r —1 )(.t— 1)(1 — pt) 

Hence or otherwise, show that 

fi rJ = 0 if r j-.v is odd 

/u 3l =2p, /i 39 -=l-f?p a [M. A. Madrass 66) 

13 Explain the ideas of marginal distributions, conditional 

distributions and the curves or regression, wth respect to a biva¬ 
riate distnbunon. Show how to find them from a given bivariate 
irequency density function /(*, v) 


(i) 


(ii) 





If dF (x, y) = (4/5) ( x+3y) dx dy ( x ^ 0, y ^ 0. obtain 

the equation of the curve of regressin of yon x. 

14. Given /(x, y) — 2/a z , 0 < x < a 

0 < y < a 

as the joint probabi'ity function of two variables a* and y , find (a) 

the marginal distributions (b) the mean and variance of each of 

the marginal distributions, fc) the equations of regression curves 

of y on x, and x on y, and (d) the correlation coefficient between 
x and y. 


15. $x_1$ and $x_2$ are independent normal variates with means
$\mu_1$, $\mu_2$ and s.d.’s $\sigma_1$, $\sigma_2$ respectively. Obtain the distribution of
(i) $x_1 - x_2$, (ii) $x_1 + x_2$.

16. If x and y have the bivariate normal distribution with
means $\mu_1$, $\mu_2$, variances $\sigma_1^2$, $\sigma_2^2$ and correlation coefficient $\rho$,
show that

$$u = \frac{x - \mu_1}{\sigma_1} + \frac{y - \mu_2}{\sigma_2} \quad \text{and} \quad v = \frac{x - \mu_1}{\sigma_1} - \frac{y - \mu_2}{\sigma_2}$$

are independently normally distributed with zero means and
variances $2\,(1 + \rho)$ and $2\,(1 - \rho)$.

Find the value of $\mu_{22}$ for the given bivariate normal
distribution.


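17. (a) Let (X, Y) be a two-dimensional random variable and
suppose that

$E(X) = \mu_x$, $E(Y) = \mu_y$, $V(X) = \sigma_x^2$, $V(Y) = \sigma_y^2$,

$$\rho = \frac{\text{cov}\,(X, Y)}{\sqrt{V(X)\,V(Y)}}.$$

Show that if the regression of Y on X is linear

$$E\,(Y \mid x) = \mu_y + \rho\,\frac{\sigma_y}{\sigma_x}\,(x - \mu_x)$$

and if the regression of X on Y is linear

$$E\,(X \mid y) = \mu_x + \rho\,\frac{\sigma_x}{\sigma_y}\,(y - \mu_y)$$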

(b) Assume that E[Y I x)= —fx—2 and E (X | y) = — \y —3 

(i) Determine the correlation coefficient p 

(ii) Determine E (X) and E(Y). 

9.9. Principle of Least Squares. Suppose we are given m equations in
n unknowns in matrix notation as follows :

$$\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \dots & \dots & \dots & \dots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix} \qquad ...(1)$$

Here the a's and b's are constants and the x's are unknown. We can
write (1) simply as

AX = B

If m = n and A is non-singular, the solution is unique, but if m > n the
values of x are not uniquely determined. In the case m > n, we determine
the values of x according to the principle of least squares. It
states that the values of x should be so chosen as to make the
expression

$$S = \sum_{i=1}^{m}\left(\sum_{j=1}^{n} a_{ij}x_j - b_i\right)^2 \qquad ...(2)$$

the minimum. The values of x so determined are called the best or
most plausible values in the least squares sense. S is minimum when

$$\frac{\partial S}{\partial x_j} = 2\sum_{i=1}^{m} a_{ij}\left(\sum_{l=1}^{n} a_{il}x_l - b_i\right) = 0, \qquad j = 1, 2, ..., n. \qquad ...(3)$$

These n equations are called the normal equations corresponding
to equations (1).
Ex. Find the most plausible values of x, y and z from the
following equations

x - y + 2z = 3, 3x + 2y - 5z = 5, 4x + y + 4z = 21,

-x + 3y + 3z = 14. [B.A. Punjab '56]

In matrix notation the equations can be written as

$$\begin{bmatrix} 1 & -1 & 2 \\ 3 & 2 & -5 \\ 4 & 1 & 4 \\ -1 & 3 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3 \\ 5 \\ 21 \\ 14 \end{bmatrix}$$

Here

$$S = (x - y + 2z - 3)^2 + (3x + 2y - 5z - 5)^2 + (4x + y + 4z - 21)^2 + (-x + 3y + 3z - 14)^2$$

The three normal equations are obtained from

$$\frac{\partial S}{\partial x} = \frac{\partial S}{\partial y} = \frac{\partial S}{\partial z} = 0$$

i.e.

(x - y + 2z - 3) + 3(3x + 2y - 5z - 5) + 4(4x + y + 4z - 21)
    - (-x + 3y + 3z - 14) = 0

-(x - y + 2z - 3) + 2(3x + 2y - 5z - 5) + (4x + y + 4z - 21)
    + 3(-x + 3y + 3z - 14) = 0

2(x - y + 2z - 3) - 5(3x + 2y - 5z - 5) + 4(4x + y + 4z - 21)
    + 3(-x + 3y + 3z - 14) = 0

These respectively simplify into

27x + 6y = 88, 6x + 15y + z = 70, y + 54z = 107

Solutions of these equations lead to

x = 2.470, y = 3.551, z = 1.916
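The worked example above can be checked numerically. The following is a minimal sketch in plain Python, assuming only that the normal equations are formed as A'AX = A'B; the helper names `normal_equations` and `solve` are illustrative and not part of the text.

```python
from fractions import Fraction

def normal_equations(A, b):
    """Return (A^T A, A^T b) for the overdetermined system A x = b."""
    m, n = len(A), len(A[0])
    ata = [[sum(A[i][r] * A[i][c] for i in range(m)) for c in range(n)]
           for r in range(n)]
    atb = [sum(A[i][r] * b[i] for i in range(m)) for r in range(n)]
    return ata, atb

def solve(M, v):
    """Solve the square system M x = v by Gauss-Jordan elimination (exact fractions)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(M, v)]
    for col in range(n):
        # pick a row with a non-zero pivot and move it into place
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# The four equations in x, y, z of the worked example
A = [[1, -1, 2], [3, 2, -5], [4, 1, 4], [-1, 3, 3]]
b = [3, 5, 21, 14]

ata, atb = normal_equations(A, b)
x, y, z = solve(ata, atb)
print(ata, atb)                      # coefficients of the three normal equations
print(float(x), float(y), float(z))  # most plausible values
```

The rows of `ata` together with `atb` reproduce the normal equations 27x + 6y = 88, 6x + 15y + z = 70, y + 54z = 107, and the solution agrees with the most plausible values to three decimal places.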

Ex. 1. Form normal equations and hence find the most plausible
values of x and y from the following :

x+y =3 01, 2x—>>=0 03, x+2y=l 03, 3*+>>=4*97 

[M.Sc. Agra '61]

[Ans. x = 0.999, y = 2.004]

Ex. 2. Find the most plausible solutions of the following
equations :

(i) x + y = 3.01, 2x - y = 0.03, x + 3y = 7.03, 3x + y = 4.97

[Agra M.Sc. '67]

[Ans. x = 0.997, y = 2.0004]

(ii) x + 2y + z = 1, 2x + y + z = 4, -x + y + 2z = 3,

4x + 2y - 5z = -7 [Punjab M.A. '45]
[Ans. x = 1.16, y = -0.76, z = 2.80]

Ex. 3. Three independent measurements on each of the angles
A, B, C of a triangle are as follows :

          A       B       C
        39.5    60.3    80.1
        39.3    62.2    80.3
        39.6    60.1    80.4
Total  118.4   182.6   240.8

Obtain the best estimate of the three angles taking into account
the relation that the sum of the angles is equal to 180.

[I.A.S. '58]

(Hint. Let the three observations on A, B, C be
denoted by $x_1, x_2, x_3$; $y_1, y_2, y_3$; $z_1, z_2, z_3$ and let the best
estimates of A, B, C be $\theta_1$, $\theta_2$, $180 - \theta_1 - \theta_2$. Make

$$S = \sum_{i=1}^{3} (x_i - \theta_1)^2 + \sum_{i=1}^{3} (y_i - \theta_2)^2 + \sum_{i=1}^{3} (z_i + \theta_1 + \theta_2 - 180)^2$$

a minimum. The normal equations are $\partial S/\partial\theta_1 = \partial S/\partial\theta_2 = 0$.

Ans. A = 39.27, B = 60.66, C = 80.07]

9.10 Fitting of Polynomials. We have seen that the line of regression
of y on x gives the best linear representation of the behaviour of y
with change of x. But often it is apparent from the data that the
regression of y on x is far from linear. In such a case we assume
that the curve

$$y = \sum_{s=0}^{k} b_s x^s, \quad\text{where the } b\text{'s are constants,}$$ ...(1)

is the curve of 'best fit'. Here the b's are unknown and their values
are to be so determined as to make (1) fit as closely as possible
the given points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Let the frequency of
the pair $(x_i, y_i)$ be $f_i$ and let

$$N = \sum_{i=1}^{n} f_i$$

Suppose that the value of y corresponding to $x_i$, determined
from (1), is taken to be $Y_i$. $Y_i$ is known as the expected value of the
observed value $y_i$. The sum of the squares of the deviations of
the observed values $y_i$ from their expected values $Y_i$, i = 1, 2, ..., n,
is denoted by

$$N s^2 = \sum_i f_i\,(y_i - Y_i)^2$$ ...(2)

where $Y_i = b_0 + b_1 x_i + b_2 x_i^2 + \cdots + b_k x_i^k$.

We have to choose the k+1 coefficients $b_0, b_1, \ldots, b_k$ so that
the sum (2) is minimum, and this is done by equating to zero the
partial derivatives of $N s^2$ with respect to these coefficients. We thus
obtain (k+1) normal equations

$$\sum_i f_i\, x_i^s \Big( y_i - \sum_{t=0}^{k} b_t x_i^t \Big) = 0, \qquad i = 1, 2, \ldots, n;\ s = 0, 1, 2, \ldots, k$$

When written separately, these are

$$\sum_i f_i\,(y_i - b_0 - b_1 x_i - \cdots - b_k x_i^k) = 0, \quad\text{when } s = 0$$

$$\sum_i f_i\, x_i\,(y_i - b_0 - b_1 x_i - \cdots - b_k x_i^k) = 0, \quad\text{when } s = 1$$ ...(3)

$$\cdots$$

$$\sum_i f_i\, x_i^k\,(y_i - b_0 - b_1 x_i - \cdots - b_k x_i^k) = 0, \quad\text{when } s = k$$

These equations are called the normal equations for fitting the
parabolic curve (1) of degree k. We can determine the (k+1) unknown
b's from equations (3).

In particular, when k = 1 and $f_i = 1$ for every i, the normal
equations for fitting the straight line $y = b_0 + b_1 x$ are

$$\sum_i y_i = n b_0 + b_1 \sum_i x_i$$
$$\sum_i x_i y_i = b_0 \sum_i x_i + b_1 \sum_i x_i^2$$ ...(4)

When k = 2, $f_i = 1$ for every i, the normal equations for fitting the
second degree parabola $y = b_0 + b_1 x + b_2 x^2$ are

$$\sum y_i = n b_0 + b_1 \sum x_i + b_2 \sum x_i^2$$
$$\sum x_i y_i = b_0 \sum x_i + b_1 \sum x_i^2 + b_2 \sum x_i^3$$ ...(5)
$$\sum x_i^2 y_i = b_0 \sum x_i^2 + b_1 \sum x_i^3 + b_2 \sum x_i^4$$

Note. When the values of $x_i$ correspond to equal increments
h, and the distribution of the $f_i$ is symmetrical about the middle
value of x, the equations (3) may be much simplified.

Suppose first that n is odd. Let n = 2m + 1 and let the x's be

x, x + h, x + 2h, ..., x + 2mh.

We transform the variable X to U by the relation

$$U = \frac{X - (x + mh)}{h}$$

where x + mh is the middle value.

Thus U : -m, -(m - 1), ..., -1, 0, 1, ..., (m - 1), m.

Hence, owing to the symmetry of distribution of the f's, the sums
of odd powers, i.e.

$$\sum U = \sum U^3 = \sum U^5 = \cdots$$

are all zero. The sums of even powers may be written down from
tables.

Suppose second that n is even. Let n = 2m and let the x's be
x, x + h, x + 2h, ..., x + (2m - 1)h. Here choose the new
variable U as

$$U = \frac{X - \tfrac{1}{2}\{(x + (m-1)h) + (x + mh)\}}{\tfrac{1}{2}h}$$

where $\tfrac{1}{2}\{(x + (m-1)h) + (x + mh)\}$ is the mean of the middle pair
of values.

Thus

U : -(2m - 1), ..., -3, -1, 1, 3, ..., (2m - 1)

and again the sums of odd powers of the U's, i.e.

$\sum U = \sum U^3 = \sum U^5 = \cdots$, are all zero.


Ex. 1. The weights of a calf taken at weekly intervals are given
below. Fit a straight line using the method of least squares, and
calculate the average rate of growth per week.

Age (x)     :   1     2     3     4     5     6     7     8     9     10

Weight (y)  : 52.5  58.7  65.0  70.2  75.4  81.1  87.2  95.5  102.2  108.4

[B.A. Hons. Delhi '56, B.A. Madras '45]

Here n = 10 and we choose the new variable as

$$U = \frac{X - 5.5}{\tfrac{1}{2}} \quad [\text{note } h = 1 \text{ in this example}]$$

= 2(X - 5.5) or U = 2X - 11

Thus

U : -9, -7, -5, -3, -1, 1, 3, 5, 7, 9

The straight line to be fitted is

$y = b_0 + b_1 u$

The normal equations are

$$\sum y = 10 b_0 + b_1 \sum u$$
$$\sum uy = b_0 \sum u + b_1 \sum u^2$$

The calculation gives

$\sum y = 796.2$, $\sum uy = 1016.8$, $\sum u^2 = 330$, $\sum u = 0$

Thus $796.2 = 10 b_0$, $1016.8 = 330 b_1$,
whence $b_0 = 79.62$, $b_1 = 3.08$ approx.

The regression equation of y on x is

$y = b_0 + b_1 (2x - 11)$

= 79.62 + 3.08 (2x - 11)
or y = 45.74 + 6.16x.

The average rate of growth per week is

6.16 units.
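The arithmetic of this example can be reproduced with a short script. This is a sketch only; it takes the tenth weight as 108.4, the value consistent with the stated totals (sum of y = 796.2 and sum of uy = 1016.8), and the variable names are illustrative.

```python
ages = list(range(1, 11))
# Tenth weight taken as 108.4 (assumption: the value that matches the stated totals)
weights = [52.5, 58.7, 65.0, 70.2, 75.4, 81.1, 87.2, 95.5, 102.2, 108.4]

u = [2 * x - 11 for x in ages]            # coded variable U = 2X - 11
n = len(u)
sum_u = sum(u)                            # vanishes by symmetry
sum_u2 = sum(v * v for v in u)
sum_y = sum(weights)
sum_uy = sum(a * b for a, b in zip(u, weights))

# With sum_u = 0 the two normal equations decouple
b0 = sum_y / n
b1 = sum_uy / sum_u2

# Back to the original variable: y = b0 + b1 (2x - 11)
slope_x = 2 * b1
intercept_x = b0 - 11 * b1
print(round(b0, 2), round(b1, 2), round(intercept_x, 2), round(slope_x, 2))
```

The slope in x, about 6.16 units per week, is the average rate of growth; the intercept in x comes out near 45.7 rather than 79.62, since 79.62 is the intercept in the coded variable u.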

Ex. 2. Fit a parabolic curve of regression of y on x to the
seven pairs of values

x : 1.0  1.5  2.0  2.5  3.0  3.5  4.0

y : 1.1  1.3  1.6  2.0  2.7  3.4  4.1

[M.Sc. Agra '54]

Here n = 7 and the common increment h of the x's is 0.5.

Take the new variable

$$U = \frac{X - 2.5}{0.5} \quad\text{or}\quad U = 2X - 5.$$

Thus U : -3, -2, -1, 0, 1, 2, 3
The parabola to be fitted is

$y = b_0 + b_1 u + b_2 u^2$

The normal equations are

$$\sum y = 7 b_0 + b_1 \sum u + b_2 \sum u^2$$
$$\sum uy = b_0 \sum u + b_1 \sum u^2 + b_2 \sum u^3$$
$$\sum u^2 y = b_0 \sum u^2 + b_1 \sum u^3 + b_2 \sum u^4$$

Obviously $\sum u = \sum u^3 = 0$. The table showing the necessary
calculations is given below :

   x      u      y     u^2    u^4     uy     u^2 y
  1.0    -3     1.1     9     81     -3.3     9.9
  1.5    -2     1.3     4     16     -2.6     5.2
  2.0    -1     1.6     1      1     -1.6     1.6
  2.5     0     2.0     0      0      0       0
  3.0     1     2.7     1      1      2.7     2.7
  3.5     2     3.4     4     16      6.8    13.6
  4.0     3     4.1     9     81     12.3    36.9
Totals    -    16.2    28    196     14.3    69.9

The normal equations become

$16.2 = 7 b_0 + 28 b_2$, $14.3 = 28 b_1$, $69.9 = 28 b_0 + 196 b_2$

which give immediately

$b_0 = 2.07$, $b_1 = 0.511$, $b_2 = 0.061$

The non-linear regression equation is therefore

$y = 2.07 + 0.511u + 0.061u^2$
$= 2.07 + 0.511 (2x - 5) + 0.061 (2x - 5)^2$

which simplifies to

$y = 1.04 - 0.20x + 0.24x^2$.
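The table and the solution of the normal equations can be verified mechanically. A minimal sketch, assuming the seven data pairs as read above and the coding U = 2X - 5; the variable names are illustrative.

```python
xs = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [1.1, 1.3, 1.6, 2.0, 2.7, 3.4, 4.1]
us = [round(2 * x - 5) for x in xs]       # coded values -3, ..., 3

n = len(us)
s_u2 = sum(u ** 2 for u in us)
s_u4 = sum(u ** 4 for u in us)
s_y = sum(ys)
s_uy = sum(u * y for u, y in zip(us, ys))
s_u2y = sum(u * u * y for u, y in zip(us, ys))

# Odd-power sums vanish, so b1 decouples and (b0, b2) solve a 2 x 2 system:
#   n*b0    + s_u2*b2 = s_y
#   s_u2*b0 + s_u4*b2 = s_u2y
b1 = s_uy / s_u2
det = n * s_u4 - s_u2 ** 2
b0 = (s_y * s_u4 - s_u2 * s_u2y) / det
b2 = (n * s_u2y - s_u2 * s_y) / det

# Substitute u = 2x - 5 to return to the original variable
c0 = b0 - 5 * b1 + 25 * b2
c1 = 2 * b1 - 20 * b2
c2 = 4 * b2
print(round(b0, 3), round(b1, 3), round(b2, 3))
print(round(c0, 3), round(c1, 3), round(c2, 3))
```

Small differences in the last digit of the coefficients in x arise because the text substitutes the rounded values 2.07, 0.511 and 0.061.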

Exercises 

1. (a) Fit a straight line to the following data regarding x as
the independent variable.

x : 0    1    2    3    4

y : 1   1.8  3.3  4.5  6.3

[Ans. y = 0.72 + 1.33x] [M.Sc. Agra '49]

(b) Fit a straight line to the following data :

x : 0    5   10   15   20   25

y : 12  15   17   22   24   30

[Ans. y = 0.69x + 11.30] [B.Sc. Agra '63]

2. Fit a least squares parabola of second degree to the data
given in example 1 (a) above.

Find out the difference between the actual value of y and the
value of y obtained from the fitted curve when x = 2.

[Ans. y = 1.48 + 1.13 (x - 2) + 0.55 (x - 2)^2 ; difference = -0.18]

[B.Sc. Agra '56]

3. Fit a second degree parabola to the following data regarding
x as the independent variable.

x : 1  2  3  4  5   6   7   8  9

y : 2  6  7  8  10  11  11  10  9

[Ans. y = -1 + 3.55x - 0.27x^2] [M.Sc. Agra '53]

4. The profits, £100y, of a certain company in the xth year
of its life are given by

x : 1   2   3   4   5

y : 25  28  33  39  46

Taking u = x - 3 and v = y - 33, find the parabolic regression
of v on u in the form

$v = a + bu + cu^2$

[Ans. $v = -0.086 + 5.3u + 0.643u^2$] [M.Sc. Agra '52]

5. The profits of a certain company in the xth year of its life
are given by

x : 1     2     3     4     5

y : 1250  1400  1650  1950  2300

Taking u = x - 3 and $v = \dfrac{y - 1650}{50}$, show that the parabola of
the second degree of v on u is

$v + 0.086 = 5.30u + 0.643u^2$

and deduce that the parabola of second degree of y on x is

$y = 1140 + 72.1x + 32.15x^2$. [B.A. Hons. Delhi '54]

6. For a given bivariate distribution, find the straight line
for which the sum of squares of the normal deviations is a minimum.

[Hint. Take the straight line $x\cos\alpha + y\sin\alpha - p = 0$. Then

$$N S^2 = \sum f_i\,(x_i\cos\alpha + y_i\sin\alpha - p)^2$$

in which $\alpha$ and p are constants to be determined. The normal
equations are

$$\sum f_i\,(y_i\cos\alpha - x_i\sin\alpha)(x_i\cos\alpha + y_i\sin\alpha - p) = 0$$
$$\sum f_i\,(x_i\cos\alpha + y_i\sin\alpha - p) = 0$$

From the second, $\bar{x}\cos\alpha + \bar{y}\sin\alpha - p = 0$.

The first simplifies into

$$(\cos^2\alpha - \sin^2\alpha)\sum f_i x_i y_i + \sin\alpha\cos\alpha \sum f_i\,(y_i^2 - x_i^2) + p\,\Big(\sin\alpha \sum f_i x_i - \cos\alpha \sum f_i y_i\Big) = 0$$

Let $\bar{x} = \bar{y} = 0$, then p = 0 and

$$\cos 2\alpha\;\mu_{11} + \sin\alpha\cos\alpha\,(\sigma_y^2 - \sigma_x^2) = 0$$

$$\tan 2\alpha = \frac{2\mu_{11}}{\sigma_x^2 - \sigma_y^2}\,]$$

7. What is meant by the method of least squares ? Explain
how you would use it to find the constants a, b and c of a second
degree parabola

$y = a + bx + cx^2$

to be fitted to given data. [B.Sc. Agra '63]

9.10. Related regression. The application of the least-squares method
to non-linear relations requires a good deal of computational effort.
But, in some cases, it is possible to convert a non-linear relation
into a linear relation. The following cases may be noted.

(i) The power function $y = a x^b$
converts into $\log y = \log a + b \log x$

or $Y = b_0 + b_1 X$ [$Y = \log y$, $X = \log x$, $b_0 = \log a$, $b_1 = b$]

(ii) The exponential function $y = a b^x$
converts into $\log y = \log a + x \log b$

(iii) The hyperbola $y = a + b/x$
can be put as $Y = a + bX$ [$Y = y$, $X = 1/x$]

(iv) The equation $y = \dfrac{x}{a + bx}$ can
be put in the form $1/y = b + a/x$
or $Y = b + aX$ [$Y = 1/y$, $X = 1/x$]
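As an illustration of such a linearising transformation, the sketch below fits $y = a e^{bx}$ by fitting a straight line to (x, log y). It uses the three data points of Ex. 1 of this section (x = 0, 2, 4; y = 5.012, 10, 31.62); the helper `fit_line` is a hypothetical name. Note that least squares applied to log y minimises squared errors on the logarithmic scale, which is not the same criterion as minimising squared errors in y itself.

```python
import math

def fit_line(xs, ys):
    """Least-squares straight line Y = A + B x via the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope

xs = [0, 2, 4]
ys = [5.012, 10.0, 31.62]

# y = a e^{bx}  becomes  log y = log a + b x  (natural logarithms)
A, B = fit_line(xs, [math.log(y) for y in ys])
a, b = math.exp(A), B
print(round(a, 3), round(b, 3))
```

The fitted constants agree with the answer $y = 4.642\,e^{0.46x}$ quoted for that exercise.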


Ex. Write the normal equations for fitting the curve $pv^{\gamma} = k$,
where k is a constant and p and v are the pressure and volume of a
gas. Fit this curve to the following set of observations, taking p
to be the independent variable.

p (Kg./cm 2 ) 0 5 1*0 1*5 2 0 2*5 3 0 

v (Litres) 1520 1C00 750 620 5.0 460 

[B. Sc Agra ’oO] 






[Hint. $pv^{\gamma} = k$ transforms into

$$\log_{10} v = \frac{1}{\gamma}\log_{10} k - \frac{1}{\gamma}\log_{10} p$$

which is of the form y = a + bx. Now proceed as usual.

Ans. $pv^{1.7} = 85110$]

Ex. 1. Fit the curve $y = a e^{bx}$ to the following data, e being
the Napierian base, 2.71828 :

x : 0      2    4

y : 5.012  10   31.62

[Ans. $y = 4.642\,e^{0.46x}$] [B.Sc. Kanpur '68]

Ex. 2. (a) Derive the least squares equations for fitting a curve
of the type (i) $y = ax + bx^{-1}$, (ii) $y = ax^b$ to a set of n points.

(b) It is known that the readings for x and y given below
should follow a law of the form $y = ax + bx^{-1}$ where a and b are
constants.

x : 1     2     3     4      5      6      7      8

y : 5.43  6.28  8.23  10.32  12.63  14.86  17.27  19.51

Use the method of least squares to find the best values of a
and b. [I.S.I. '68]

[Ans. a = 2.40, b = 3.03]

Ex. 3. Fit the curve $y = a e^{bx}$ to the following data :

x : 1     2     3     4     5     6     7     8

y : 15.3  20.5  27.4  36.6  49.1  65.6  87.8  117.6

[Ans. $y = 11.58\,e^{0.2898x}$]

Ex. 4. For the data given below find the equation to the
best fitting exponential curve of the form $y = a e^{bx}$.

x : 1    2    3     4     5      6

y : 1.6  4.5  13.8  40.5  125.0  300

[B.Sc. Agra '62]


10

Multiple and Partial Correlation

10.1. Introductory. Yule's notation.

It often happens that the values of a variable are influenced
not only by a single variable, but by those of several others, e.g.
the yield of grain is affected by the amounts of different fertilizers
used. If we have to assess the combined influence of a group of
variables upon a variable outside that group, then our study is
that of multiple regression and multiple correlation. If, however,
we wish to examine the effect of one variable upon the second,
when the effects of all the others have been eliminated, our study
is that of partial correlation. The analysis involved becomes very
simple and compact on using Yule's notation.

Let us consider the variables $x_1$ and $x_2$ with $E(x_1) = E(x_2) = 0$
and $V(x_1) = \sigma_1^2$, $V(x_2) = \sigma_2^2$. The lines of regression of $x_1$ on $x_2$,
and of $x_2$ on $x_1$, are denoted by

$x_1 = b_{12} x_2$, $x_2 = b_{21} x_1$

where $b_{12}$ and $b_{21}$ are the regression coefficients, determined by
the principle of least squares. The estimate of $x_1$ is provided by
$b_{12} x_2$ and that of $x_2$ by $b_{21} x_1$. The differences $x_1 - b_{12} x_2$ and
$x_2 - b_{21} x_1$ are called residuals, or errors of estimate of the
variables. These are expressed by

$x_{1.2} = x_1 - b_{12} x_2$, $x_{2.1} = x_2 - b_{21} x_1$

For determining $b_{12}$ and $b_{21}$, the sums $\sum x_{1.2}^2$ and $\sum x_{2.1}^2$ must
be least. This leads to the two normal equations

$\sum (x_1 - b_{12} x_2) = 0$ (which holds since $E(x_1) = 0 \Rightarrow \sum x_1 = 0$
and $E(x_2) = 0 \Rightarrow \sum x_2 = 0$)

$\sum x_2 (x_1 - b_{12} x_2) = 0$

in one case, and

$\sum (x_2 - b_{21} x_1) = 0$

$\sum x_1 (x_2 - b_{21} x_1) = 0$

in the other. The above normal equations can be expressed in
compact notation as

$\sum x_{1.2} = 0$, $\sum x_2 x_{1.2} = 0$
$\sum x_{2.1} = 0$, $\sum x_1 x_{2.1} = 0$

The summation includes all pairs of values of the distribution.
Obviously

$$b_{12} = \frac{\sum x_1 x_2}{\sum x_2^2} = \frac{\operatorname{Cov}(x_1, x_2)}{\sigma_2^2} = r_{12}\,\frac{\sigma_1}{\sigma_2}, \qquad b_{21} = \frac{\sum x_1 x_2}{\sum x_1^2} = \frac{\operatorname{Cov}(x_1, x_2)}{\sigma_1^2} = r_{12}\,\frac{\sigma_2}{\sigma_1}$$

The coefficient of correlation $r_{12}$ or $r_{21}$ is given by

$$r_{12}^2 = b_{12}\, b_{21} = r_{21}^2$$

$r_{12}$ having the same sign as $b_{12}$ or $b_{21}$, which is the sign of $\sum x_1 x_2$.
It may be noted that though $r_{12} = r_{21}$, in general $b_{12} \neq b_{21}$. The sums
of the squares of the deviations of the points from the lines of
regression $x_1 = b_{12} x_2$ and $x_2 = b_{21} x_1$ are given by

$$\sum x_{1.2}^2 = N\sigma_1^2 (1 - r_{12}^2), \qquad \sum x_{2.1}^2 = N\sigma_2^2 (1 - r_{21}^2)$$

We define

$$\sigma_{1.2}^2 = \frac{1}{N}\sum x_{1.2}^2 = \sigma_1^2 (1 - r_{12}^2), \qquad \sigma_{2.1}^2 = \frac{1}{N}\sum x_{2.1}^2 = \sigma_2^2 (1 - r_{21}^2)$$

$\sigma_{1.2}$ and $\sigma_{2.1}$ are the standard deviations of the variables $x_{1.2}$ and
$x_{2.1}$ and are referred to as of the first order. The order is determined
by the number of subscripts following the point, which are called
secondary subscripts. The primary subscripts are those preceding
the point.

The regression of $x_1$ on $x_2$ is said to be linear if the mean of
each array of $x_1$'s is on the line of regression

$x_1 = r_{12}\,(\sigma_1/\sigma_2)\, x_2$.

Distribution of three or more variables

The best estimate of $x_1$ is given by $X_1$, where $X_1$ is determinable
from the plane of regression of $x_1$ on $x_2$ and $x_3$, that is

$X_1 = a + b_{12.3}\, x_2 + b_{13.2}\, x_3$

where we assume $\sum x_1 = \sum x_2 = \sum x_3 = 0$. The constants a, $b_{12.3}$,
$b_{13.2}$ are chosen so as to make

$\sum (x_1 - X_1)^2$, i.e. $\sum (x_1 - a - b_{12.3}\, x_2 - b_{13.2}\, x_3)^2$,

a minimum. The subscripts to the coefficients are written down
on the following principle. The first is that of the variable for
which the estimate is being found, and the second is that of the
variable which the coefficient multiplies. These are primary
subscripts. Separated from them by a point are the subscripts
of the other variables that enter into the equation. These are
secondary subscripts, and their number determines the order of
the regression coefficient.

The normal equations corresponding to the regression equation

$X_1 = a + b_{12.3}\, x_2 + b_{13.2}\, x_3$

obtained by minimising $S = \sum (x_1 - a - b_{12.3}\, x_2 - b_{13.2}\, x_3)^2$ are

$$\frac{\partial S}{\partial a} = 0, \quad\text{i.e.}\quad \sum (x_1 - a - b_{12.3}\, x_2 - b_{13.2}\, x_3) = 0$$

$$\frac{\partial S}{\partial b_{12.3}} = 0, \quad\text{i.e.}\quad \sum x_2\,(x_1 - a - b_{12.3}\, x_2 - b_{13.2}\, x_3) = 0$$

$$\frac{\partial S}{\partial b_{13.2}} = 0, \quad\text{i.e.}\quad \sum x_3\,(x_1 - a - b_{12.3}\, x_2 - b_{13.2}\, x_3) = 0$$

Now $\sum x_1 = \sum x_2 = \sum x_3 = 0 \Rightarrow a = 0$,
and the regression equation of $x_1$ on $x_2$ and $x_3$ is

$X_1 = b_{12.3}\, x_2 + b_{13.2}\, x_3$

Thus the normal equations are

$\sum x_{1.23} = 0$, $\sum x_2\, x_{1.23} = 0$, $\sum x_3\, x_{1.23} = 0$

in which $x_{1.23}$, defined by

$x_{1.23} = x_1 - b_{12.3}\, x_2 - b_{13.2}\, x_3 = x_1 - X_1$

($X_1$ being the estimate of $x_1$ from the regression equation of
$x_1$ on $x_2$ and $x_3$), is the residual, or error of estimate, of $x_1$ from
the regression equation (1). Since $\sum x_{1.23} = 0$, the S.D. of $x_{1.23}$ is
given by

$$\sigma_{1.23}^2 = \frac{1}{N}\sum x_{1.23}^2$$ ...(2)

the summation covering the whole distribution, whose total
frequency is N.

Similarly, the regression equation of $x_2$ on $x_1$ and $x_3$ is given
by

$x_2 = b_{21.3}\, x_1 + b_{23.1}\, x_3$

and the corresponding error of estimate is

$x_{2.13} = x_2 - b_{21.3}\, x_1 - b_{23.1}\, x_3$

and the normal equations are

$\sum x_1\, x_{2.13} = 0 = \sum x_3\, x_{2.13}$

It may be noted that $b_{12.3} \neq b_{21.3}$.



From the above, it is evident that the sum of the products of
corresponding values of a variable and a residual is zero, when the
subscript of the variable is included among the secondary subscripts
of the residual, the summation covering the whole distribution.

Further,

$$\sum x_{1.23}\, x_{1.2} = \sum x_{1.23}\,(x_1 - b_{12}\, x_2) = \sum x_{1.23}\, x_1$$

and

$$\sum x_{1.23}\, x_{1.23} = \sum x_{1.23}\,(x_1 - b_{12.3}\, x_2 - b_{13.2}\, x_3) = \sum x_{1.23}\, x_1$$

Thus the sum of the products of two residuals is unaltered by
omitting from one of the factors any secondary subscripts which are
common to both.

We may note $\sum x_{1.23}\, x_{2.3} = \sum x_{1.23}\,(x_2 - b_{23}\, x_3) = 0$.
Thus the sum of the products of two residuals is zero if all
the subscripts of the one are included among the secondary
subscripts of the other.
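These three properties of residuals can be checked on any small data set. The sketch below uses made-up observations (purely illustrative, not from the text), measures each variable from its mean, solves the two normal equations for $b_{12.3}$ and $b_{13.2}$, and then evaluates the sums of products; all names are hypothetical.

```python
def centre(v):
    """Measure a variable from its mean."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(u, v):
    """Sum of products of corresponding values."""
    return sum(a * b for a, b in zip(u, v))

# Illustrative data only
x1 = centre([2.0, 4.0, 5.0, 7.0, 9.0, 12.0])
x2 = centre([1.0, 2.0, 4.0, 3.0, 6.0, 7.0])
x3 = centre([5.0, 3.0, 6.0, 2.0, 4.0, 1.0])

# Normal equations of x1 on x2 and x3, solved by Cramer's rule
s22, s33, s23 = dot(x2, x2), dot(x3, x3), dot(x2, x3)
s12, s13 = dot(x1, x2), dot(x1, x3)
det = s22 * s33 - s23 ** 2
b12_3 = (s12 * s33 - s13 * s23) / det
b13_2 = (s13 * s22 - s12 * s23) / det

x1_23 = [a - b12_3 * b - b13_2 * c for a, b, c in zip(x1, x2, x3)]  # x_{1.23}
x1_2 = [a - (s12 / s22) * b for a, b in zip(x1, x2)]                # x_{1.2}
x2_3 = [b - (s23 / s33) * c for b, c in zip(x2, x3)]                # x_{2.3}

print(dot(x2, x1_23), dot(x3, x1_23))                          # both ~ 0
print(dot(x1_23, x1_2), dot(x1_23, x1_23), dot(x1_23, x1))     # all equal
print(dot(x1_23, x2_3))                                        # ~ 0
```

Up to rounding error, the first and third printed lines are zero and the three quantities on the second line coincide.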


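10.2. Equation of regression plane, regression coefficients.

The regression equation

$x_1 = b_{12.3}\, x_2 + b_{13.2}\, x_3$ ...(1)

is the plane of regression of $x_1$ on $x_2$ and $x_3$. The normal equations

$\sum x_2\, x_{1.23} = 0$ and $\sum x_3\, x_{1.23} = 0$

when written in full are

$$\sum x_1 x_2 - b_{12.3} \sum x_2^2 - b_{13.2} \sum x_2 x_3 = 0$$
$$\sum x_1 x_3 - b_{12.3} \sum x_2 x_3 - b_{13.2} \sum x_3^2 = 0$$

or

$$\sigma_1\sigma_2\, r_{12} - b_{12.3}\,\sigma_2^2 - b_{13.2}\,\sigma_2\sigma_3\, r_{23} = 0$$
$$\sigma_1\sigma_3\, r_{13} - b_{12.3}\,\sigma_2\sigma_3\, r_{23} - b_{13.2}\,\sigma_3^2 = 0$$ ...(2)

where $r_{12}$ is the total correlation between $x_1$ and $x_2$, obtained by
ignoring the values of $x_3$; similarly, $r_{23}$ is the total correlation of
$x_2$ and $x_3$, and so on.

The equations (2), arranged in the unknowns $b_{12.3}$ and $b_{13.2}$ as

$$b_{12.3}\,\sigma_2 + b_{13.2}\,\sigma_3\, r_{23} = \sigma_1\, r_{12}$$
$$b_{12.3}\,\sigma_2\, r_{23} + b_{13.2}\,\sigma_3 = \sigma_1\, r_{13}$$ ...(3)

yield by Cramer's rule

$$b_{12.3} = \frac{\begin{vmatrix} \sigma_1 r_{12} & \sigma_3 r_{23} \\ \sigma_1 r_{13} & \sigma_3 \end{vmatrix}}{\begin{vmatrix} \sigma_2 & \sigma_3 r_{23} \\ \sigma_2 r_{23} & \sigma_3 \end{vmatrix}} = \frac{\sigma_1}{\sigma_2}\cdot\frac{\begin{vmatrix} r_{12} & r_{23} \\ r_{13} & 1 \end{vmatrix}}{\begin{vmatrix} 1 & r_{23} \\ r_{23} & 1 \end{vmatrix}} = -\frac{\sigma_1}{\sigma_2}\cdot\frac{\Delta_{12}}{\Delta_{11}}$$

where $\Delta_{ij}$ is the cofactor of the element in the ith row and
the jth column of the determinant

$$\Delta = \begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix}$$

Note : $r_{12} = r_{21}$, $r_{13} = r_{31}$, $r_{23} = r_{32}$.

Similarly

$$b_{13.2} = -\frac{\sigma_1}{\sigma_3}\cdot\frac{\Delta_{13}}{\Delta_{11}}$$

Thus the equation of the regression plane is

$$x_1 = -\frac{\sigma_1}{\sigma_2}\cdot\frac{\Delta_{12}}{\Delta_{11}}\, x_2 - \frac{\sigma_1}{\sigma_3}\cdot\frac{\Delta_{13}}{\Delta_{11}}\, x_3$$

or

$$\frac{x_1}{\sigma_1}\,\Delta_{11} + \frac{x_2}{\sigma_2}\,\Delta_{12} + \frac{x_3}{\sigma_3}\,\Delta_{13} = 0$$

or

$$\begin{vmatrix} \dfrac{x_1}{\sigma_1} & \dfrac{x_2}{\sigma_2} & \dfrac{x_3}{\sigma_3} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix} = 0$$ ...(4)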
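The cofactor expressions for the regression coefficients are easy to evaluate. The sketch below is an illustration under assumed values of the standard deviations and total correlations (not taken from the text); `plane_coefficients` is a hypothetical helper that returns $b_{12.3}$ and $b_{13.2}$, which are then substituted back into the pair of equations (3).

```python
def plane_coefficients(s1, s2, s3, r12, r13, r23):
    """b_{12.3} and b_{13.2} from the cofactors of the correlation determinant."""
    d11 = 1 - r23 ** 2            # cofactor of element (1, 1)
    d12 = -(r12 - r13 * r23)      # cofactor of element (1, 2)
    d13 = r12 * r23 - r13         # cofactor of element (1, 3)
    b12_3 = -(s1 / s2) * d12 / d11
    b13_2 = -(s1 / s3) * d13 / d11
    return b12_3, b13_2

# Illustrative values only
s1, s2, s3 = 2.0, 1.5, 3.0
r12, r13, r23 = 0.6, 0.5, 0.4

b12_3, b13_2 = plane_coefficients(s1, s2, s3, r12, r13, r23)

# Substitute back into equations (3)
lhs_first = b12_3 * s2 + b13_2 * s3 * r23    # should equal s1 * r12
lhs_second = b12_3 * s2 * r23 + b13_2 * s3   # should equal s1 * r13
print(b12_3, b13_2, lhs_first, lhs_second)
```

Both left-hand sides reproduce $\sigma_1 r_{12}$ and $\sigma_1 r_{13}$, confirming that the cofactor form is the Cramer's rule solution.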


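If the origin is not necessarily at the mean, the equation of
the plane of regression is

$$\begin{vmatrix} \dfrac{x_1 - \bar{x}_1}{\sigma_1} & \dfrac{x_2 - \bar{x}_2}{\sigma_2} & \dfrac{x_3 - \bar{x}_3}{\sigma_3} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix} = 0$$

Since we have

$\sum x_{2.3}\, x_{1.23} = 0$

or $\sum x_{2.3}\,(x_1 - b_{12.3}\, x_2 - b_{13.2}\, x_3) = 0$

or $\sum x_{2.3}\, x_1 - b_{12.3} \sum x_{2.3}\, x_2 = 0$ (because $\sum x_{2.3}\, x_3 = 0$),

$$b_{12.3} = \frac{\sum x_{2.3}\, x_1}{\sum x_{2.3}\, x_2} = \frac{\sum x_{1.3}\, x_{2.3}}{\sum x_{2.3}^2} = \frac{\operatorname{cov}(x_{1.3}, x_{2.3})}{V(x_{2.3})}$$ ...(5)

This leads us to conclude that $b_{12.3}$ is the coefficient of regression
of $x_{1.3}$ on $x_{2.3}$. Similarly, $b_{21.3}$ is the coefficient of regression
of $x_{2.3}$ on $x_{1.3}$. The coefficient of correlation between $x_{1.3}$ and
$x_{2.3}$ is defined by

$$r_{12.3}^2 = b_{12.3}\, b_{21.3}$$ ...(6)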



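Variance of a Residual.

We have $N\sigma_{1.23}^2 = \sum x_{1.23}^2 = \sum x_1\, x_{1.23}$

$= \sum x_1\,(x_1 - b_{12.3}\, x_2 - b_{13.2}\, x_3)$

or $N\sigma_{1.23}^2 = N\sigma_1^2 - N b_{12.3}\,\sigma_1\sigma_2\, r_{12} - N b_{13.2}\,\sigma_1\sigma_3\, r_{13}$

or $\sigma_1 \left( 1 - \dfrac{\sigma_{1.23}^2}{\sigma_1^2} \right) = b_{12.3}\,\sigma_2\, r_{12} + b_{13.2}\,\sigma_3\, r_{13}$

Also (3) are $r_{12}\,\sigma_1 = b_{12.3}\,\sigma_2 + b_{13.2}\, r_{23}\,\sigma_3$

$r_{13}\,\sigma_1 = b_{12.3}\,\sigma_2\, r_{23} + b_{13.2}\,\sigma_3$

Eliminating the b's between these equations, we get

$$\begin{vmatrix} 1 - \dfrac{\sigma_{1.23}^2}{\sigma_1^2} & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix} = 0 \qquad (\text{Note } r_{12} = r_{21},\ r_{13} = r_{31})$$

or

$$\begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix} + \begin{vmatrix} -\dfrac{\sigma_{1.23}^2}{\sigma_1^2} & r_{12} & r_{13} \\ 0 & 1 & r_{23} \\ 0 & r_{32} & 1 \end{vmatrix} = 0$$

or

$$\Delta - \frac{\sigma_{1.23}^2}{\sigma_1^2}\,\Delta_{11} = 0$$

$$\sigma_{1.23}^2 = \sigma_1^2\,\frac{\Delta}{\Delta_{11}}$$ ...(7)

Cor. 1. $\sigma_{1.3}^2 = \sigma_1^2 (1 - r_{13}^2)$, $\sigma_{2.3}^2 = \sigma_2^2 (1 - r_{23}^2)$

Cor. 2. $\sum x_{1.3}\, x_{2.3} = \sum x_1\, x_{2.3} = \sum x_1\,(x_2 - b_{23}\, x_3)$

or $b_{12.3}\,\sigma_{2.3}^2 = b_{12}\,\sigma_2^2 - b_{13}\,\sigma_3^2\, b_{23}$

$= b_{12}\,\sigma_2^2 - b_{13}\,\sigma_3^2 \cdot b_{32}\,\dfrac{\sigma_2^2}{\sigma_3^2} \qquad \left( \because\ b_{23} = b_{32}\,\dfrac{\sigma_2^2}{\sigma_3^2} \right)$

or $b_{12.3}\,\sigma_2^2 (1 - r_{23}^2) = \sigma_2^2\,(b_{12} - b_{13}\, b_{32})$

$$b_{12.3} = \frac{b_{12} - b_{13}\, b_{32}}{1 - b_{23}\, b_{32}}$$

(M.A. Bombay '56)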

10.3. Multiple Correlation.

The correlation between $x_1$ and its estimate $X_1$ obtained
from the regression equation $x_1 = b_{12.3}\, x_2 + b_{13.2}\, x_3$ is called the
coefficient of multiple correlation between $x_1$ and the two variables
$x_2$ and $x_3$ and is denoted by $R_{1(23)}$.

It is determined as follows.

We have $X_1 = b_{12.3}\, x_2 + b_{13.2}\, x_3 = x_1 - x_{1.23}$

Correlation between $x_1$ and its estimate $X_1$ is given by

$$R_{1(23)} = \frac{\sum x_1 X_1}{\sqrt{\sum x_1^2 \cdot \sum X_1^2}}$$ ...(1)

Now $\sum x_1 X_1 = \sum x_1\,(x_1 - x_{1.23}) = \sum x_1^2 - \sum x_{1.23}^2$ (since $\sum x_1\, x_{1.23} = \sum x_{1.23}\, x_{1.23}$)

$= N\,(\sigma_1^2 - \sigma_{1.23}^2)$

Also $\sum X_1^2 = \sum (x_1 - x_{1.23})^2$

$= \sum x_1^2 - 2\sum x_1\, x_{1.23} + \sum x_{1.23}^2$

$= \sum x_1^2 - \sum x_{1.23}^2$

$= N\,(\sigma_1^2 - \sigma_{1.23}^2)$

Hence

$$R_{1(23)} = \frac{N\,(\sigma_1^2 - \sigma_{1.23}^2)}{N\sigma_1 \sqrt{\sigma_1^2 - \sigma_{1.23}^2}} = \left( \frac{\sigma_1^2 - \sigma_{1.23}^2}{\sigma_1^2} \right)^{1/2}$$

or

$$1 - R_{1(23)}^2 = \frac{\sigma_{1.23}^2}{\sigma_1^2} = \frac{\Delta}{\Delta_{11}}$$ ...(2)

The values of $\Delta$ and $\Delta_{11}$ are

$\Delta = 1 + 2 r_{12} r_{23} r_{31} - r_{12}^2 - r_{23}^2 - r_{13}^2$

$\Delta_{11} = 1 - r_{23}^2$

Thus (2) reduces to

$$R_{1(23)}^2 = \frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{23} r_{31}}{1 - r_{23}^2}$$ ...(3)

which expresses the multiple correlation in terms of the
total correlations between the pairs of variables.

Cor. 1. Since $\sum x_1 X_1 = N\,(\sigma_1^2 - \sigma_{1.23}^2) = \sum X_1^2$
and $R_{1(23)} = \dfrac{\sum x_1 X_1}{\sqrt{\sum x_1^2 \cdot \sum X_1^2}}$, it is implied that
$R_{1(23)}$ cannot be negative.

Cor. 2. $R_{1(23)} = 1 \Rightarrow \sigma_{1.23} = 0$, so that all the deviations $x_{1.23}$
are zero and $x_1$ is accurately given by the regression equation. In
this case $x_1$ is a linear function of $x_2$ and $x_3$.

Cor. 3. $R_{1(23)} = 1 \Rightarrow r_{12}^2 + r_{13}^2 + r_{23}^2 - 2 r_{12} r_{23} r_{31} = 1$

This is in fact the necessary and sufficient condition that the
three regression planes in the case of a trivariate distribution coincide.
The regression planes of $x_2$ on $x_1$ and $x_3$, and of $x_3$ on $x_1$ and $x_2$,
are

$$\frac{x_1}{\sigma_1}\,\Delta_{21} + \frac{x_2}{\sigma_2}\,\Delta_{22} + \frac{x_3}{\sigma_3}\,\Delta_{23} = 0$$

$$\frac{x_1}{\sigma_1}\,\Delta_{31} + \frac{x_2}{\sigma_2}\,\Delta_{32} + \frac{x_3}{\sigma_3}\,\Delta_{33} = 0$$



10.4. Partial Correlation.

The correlation between $x_{1.3}$ and $x_{2.3}$ is called the coefficient
of partial correlation between $x_1$ and $x_2$ and is denoted by $r_{12.3}$.
Since $x_{1.3} = x_1 - b_{13}\, x_3$, it is the deviation of $x_1$ from its estimate
$b_{13}\, x_3$ and we may regard it as that part of the variable $x_1$ which
remains after the influence of $x_3$ has been eliminated. A similar
interpretation can be given to $x_{2.3}$.

The partial correlation $r_{12.3}$ between $x_1$ and $x_2$ is calculated
on the assumption that the variables $x_1$ and $x_2$ are influenced only
by each other, and not by any other variable. It is determined as
follows.

$0 = \sum x_{2.3}\, x_{1.23} = \sum x_{2.3}\,(x_1 - b_{12.3}\, x_2 - b_{13.2}\, x_3)$

$= \sum x_1\, x_{2.3} - b_{12.3} \sum x_{2.3}\, x_2$

$= \sum x_{1.3}\, x_{2.3} - b_{12.3} \sum x_{2.3}\, x_{2.3}$

$$b_{12.3} = \frac{\sum x_{1.3}\, x_{2.3}}{\sum x_{2.3}^2}$$ ...(1)

Thus $b_{12.3}$ is the coefficient of regression of $x_{1.3}$ on $x_{2.3}$. Similarly,
$b_{21.3}$ is the coefficient of regression of $x_{2.3}$ on $x_{1.3}$ and its value is

$$b_{21.3} = \frac{\sum x_{2.3}\, x_{1.3}}{\sum x_{1.3}^2}$$ ...(2)

By definition

$$r_{12.3} = \frac{\operatorname{Cov}(x_{1.3}, x_{2.3})}{\sqrt{V(x_{1.3})\, V(x_{2.3})}} = \frac{\sum x_{1.3}\, x_{2.3}}{\sqrt{\sum x_{1.3}^2 \cdot \sum x_{2.3}^2}}$$ ...(3)

or

$$r_{12.3}^2 = b_{12.3}\, b_{21.3} = \left( -\frac{\sigma_1}{\sigma_2}\cdot\frac{\Delta_{12}}{\Delta_{11}} \right) \left( -\frac{\sigma_2}{\sigma_1}\cdot\frac{\Delta_{21}}{\Delta_{22}} \right)$$

or

$$r_{12.3}^2 = \frac{\Delta_{12}^2}{\Delta_{11}\,\Delta_{22}}$$ ...(4)

Since $r_{12.3}$ has the same sign as $b_{12.3}$, which is the sign of $-\Delta_{12}$,
we have

$$r_{12.3} = -\frac{\Delta_{12}}{\sqrt{\Delta_{11}\,\Delta_{22}}} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$$ ...(5)

Cor. 1. Since $r_{12.3}^2 \le 1$, we have

$(r_{12} - r_{13}\, r_{23})^2 \le (1 - r_{13}^2)(1 - r_{23}^2)$,

that is, $r_{12}^2 + r_{13}^2 + r_{23}^2 - 2 r_{12}\, r_{13}\, r_{23} \le 1$.

If $r_{12}$, $r_{13}$ are assumed known, then $r_{23}$ must lie in the range

$$r_{12}\, r_{13} \pm \sqrt{1 - r_{12}^2 - r_{13}^2 + r_{12}^2\, r_{13}^2}$$ ...(6)

Cor. 2. Since $\sigma_{1.3}$ and $\sigma_{2.3}$ are the standard deviations of $x_{1.3}$
and $x_{2.3}$, the coefficient of partial correlation is connected with
the coefficient of partial regression by the formulae

$$b_{12.3} = r_{12.3}\,\frac{\sigma_{1.3}}{\sigma_{2.3}}, \qquad b_{21.3} = r_{12.3}\,\frac{\sigma_{2.3}}{\sigma_{1.3}}$$ ...(7)

Since

$$r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}} \quad\text{and}\quad \frac{\sigma_{1.3}}{\sigma_{2.3}} = \frac{\sigma_1 \sqrt{1 - r_{13}^2}}{\sigma_2 \sqrt{1 - r_{23}^2}},$$

$$b_{12.3} = \frac{\sigma_1}{\sigma_2}\cdot\frac{r_{12} - r_{13}\, r_{23}}{1 - r_{23}^2}$$ ...(8)

Cor. 3. Reduction formula for the order of a standard deviation.

$\sum x_{1.23}^2 = \sum x_{1.2}\, x_{1.23} = \sum x_{1.2}\,(x_1 - b_{12.3}\, x_2 - b_{13.2}\, x_3)$

$= \sum x_{1.2}^2 - b_{13.2} \sum x_{1.2}\, x_{3.2}$

$\Rightarrow \sigma_{1.23}^2 = \sigma_{1.2}^2\,(1 - b_{13.2}\, b_{31.2}) \qquad \left( \because\ b_{31.2} = \dfrac{\sum x_{1.2}\, x_{3.2}}{\sum x_{1.2}^2} \right)$

or $\sigma_{1.23}^2 = \sigma_{1.2}^2\,(1 - r_{13.2}^2)$ ...(9)

Since $r_{13.2}^2 \le 1$, $\sigma_{1.23} \le \sigma_{1.2}$ ...(10)

Thus the estimate of $x_1$ from $x_2$ and $x_3$ is in general better than
the estimate from $x_2$ alone.

Further (9) yields

$\sigma_1^2\,(1 - R_{1(23)}^2) = \sigma_1^2\,(1 - r_{12}^2)(1 - r_{13.2}^2)$

$\Rightarrow 1 - R_{1(23)}^2 = (1 - r_{12}^2)(1 - r_{13.2}^2)$

$\Rightarrow 1 - R_{1(23)}^2 \le 1 - r_{12}^2$

or $R_{1(23)}^2 \ge r_{12}^2$, or (similarly) $R_{1(23)}^2 \ge r_{13}^2$ ...(11)

If $R_{1(23)}$ is zero, both $r_{12}$ and $r_{13}$ must be zero, and $x_1$ is then
uncorrelated with either $x_2$ or $x_3$.
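Formula (5) and the relation $1 - R_{1(23)}^2 = (1 - r_{12}^2)(1 - r_{13.2}^2)$ can be illustrated numerically. The sketch below assumes three total correlations chosen only for illustration; `partial_corr` is a hypothetical helper.

```python
import math

def partial_corr(r_ab, r_ac, r_bc):
    """Partial correlation r_{ab.c} from the three total correlations."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

# Illustrative total correlations only
r12, r13, r23 = 0.6, 0.5, 0.4

r12_3 = partial_corr(r12, r13, r23)   # x3 eliminated
r13_2 = partial_corr(r13, r12, r23)   # x2 eliminated

# Multiple correlation of x1 on x2 and x3
R_sq = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)

lhs = 1 - R_sq
rhs = (1 - r12 ** 2) * (1 - r13_2 ** 2)
print(r12_3, r13_2, R_sq, lhs, rhs)
```

For these values `lhs` and `rhs` agree, and `R_sq` is at least as large as both $r_{12}^2$ and $r_{13}^2$, as (11) requires.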

Ex. Define multiple and partial correlation coefficients. For a
trivariate distribution, show that

$1 - R_{1(23)}^2 = (1 - r_{12}^2)(1 - r_{13.2}^2)$

Deduce $R_{1(23)} \ge r_{12}$. (B.Sc. Agra '67)


Solved Examples 
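Ex. 1. Prove the identity

$b_{12.3}\, b_{23.1}\, b_{31.2} = r_{12.3}\, r_{23.1}\, r_{31.2}$

$$b_{12.3} = \frac{\sum x_{1.3}\, x_{2.3}}{\sum x_{2.3}^2} = r_{12.3}\,\frac{\sigma_{1.3}}{\sigma_{2.3}}$$

Similarly $b_{23.1} = r_{23.1}\,\dfrac{\sigma_{2.1}}{\sigma_{3.1}}$, $b_{31.2} = r_{31.2}\,\dfrac{\sigma_{3.2}}{\sigma_{1.2}}$

Now use

$\sigma_{1.3} = \sigma_1 \sqrt{1 - r_{13}^2}$, $\sigma_{2.1} = \sigma_2 \sqrt{1 - r_{12}^2}$, $\sigma_{3.2} = \sigma_3 \sqrt{1 - r_{23}^2}$

$\sigma_{3.1} = \sigma_3 \sqrt{1 - r_{13}^2}$, $\sigma_{1.2} = \sigma_1 \sqrt{1 - r_{12}^2}$, $\sigma_{2.3} = \sigma_2 \sqrt{1 - r_{23}^2}$

and the result is obtained on multiplication and cancellation.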

Ex. 2. A number of persons are measured for their heights x,
weights y and chest expansions z and product moment correlation
coefficients are calculated. Prove that

$r_{xy} + r_{yz} + r_{zx} \ge -3/2$. [Delhi M.A. '56]

Let $\bar{x} = \bar{y} = \bar{z} = 0$.

Then

$$E\left[ \frac{x}{\sigma_x} + \frac{y}{\sigma_y} + \frac{z}{\sigma_z} \right]^2 \ge 0$$

or

$$\sum \frac{E(x^2)}{\sigma_x^2} + 2 \sum \frac{E(xy)}{\sigma_x \sigma_y} \ge 0$$

or

$3 + 2 r_{xy} + 2 r_{yz} + 2 r_{zx} \ge 0$

$\Rightarrow r_{xy} + r_{yz} + r_{zx} \ge -3/2$

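Ex. 3. Prove the formulae :

(a) $b_{12} = \dfrac{b_{12.3} + b_{13.2}\, b_{32.1}}{1 - b_{13.2}\, b_{31.2}}$

(b) $r_{12} = \dfrac{r_{12.3} + r_{13.2}\, r_{23.1}}{\sqrt{(1 - r_{23.1}^2)(1 - r_{13.2}^2)}}$

The right hand side of (a)

$$= \frac{-\dfrac{\sigma_1}{\sigma_2}\dfrac{\Delta_{12}}{\Delta_{11}} + \left( -\dfrac{\sigma_1}{\sigma_3}\dfrac{\Delta_{13}}{\Delta_{11}} \right)\left( -\dfrac{\sigma_3}{\sigma_2}\dfrac{\Delta_{32}}{\Delta_{33}} \right)}{1 - \left( -\dfrac{\sigma_1}{\sigma_3}\dfrac{\Delta_{13}}{\Delta_{11}} \right)\left( -\dfrac{\sigma_3}{\sigma_1}\dfrac{\Delta_{31}}{\Delta_{33}} \right)} = \frac{\sigma_1}{\sigma_2} \left[ \frac{\Delta_{32}\,\Delta_{13} - \Delta_{12}\,\Delta_{33}}{\Delta_{11}\,\Delta_{33} - \Delta_{13}\,\Delta_{31}} \right]$$

Now

$\Delta_{32} = (-1)(r_{23} - r_{21}\, r_{13})$

$\Delta_{13} = r_{21}\, r_{32} - r_{31}$, $\Delta_{12} = (-1)(r_{12} - r_{31}\, r_{23})$

$\Delta_{33} = 1 - r_{12}^2$, $\Delta_{31} = r_{12}\, r_{23} - r_{13}$, $\Delta_{11} = 1 - r_{23}^2$

It is easily verified that

$\Delta_{32}\,\Delta_{13} - \Delta_{12}\,\Delta_{33} = r_{12}\,[\,2 r_{12} r_{13} r_{32} - r_{13}^2 - r_{23}^2 - r_{12}^2 + 1\,] = r_{12}\,\Delta$

$\Delta_{11}\,\Delta_{33} - \Delta_{13}\,\Delta_{31} = 1 - r_{12}^2 - r_{23}^2 - r_{13}^2 + 2 r_{12} r_{23} r_{31} = \Delta$

Hence R.H.S. $= \dfrac{\sigma_1}{\sigma_2}\, r_{12} = b_{12}$ = L.H.S.

Interchanging 1 and 2 in (a) we get

$$b_{21} = \frac{b_{21.3} + b_{23.1}\, b_{31.2}}{1 - b_{23.1}\, b_{32.1}}$$

Thus $r_{12}^2 = b_{12}\, b_{21}$

$$= \frac{b_{12.3}\, b_{21.3} + b_{12.3}\, b_{23.1}\, b_{31.2} + b_{21.3}\, b_{13.2}\, b_{32.1} + b_{13.2}\, b_{31.2}\, b_{23.1}\, b_{32.1}}{(1 - b_{13.2}\, b_{31.2})(1 - b_{23.1}\, b_{32.1})}$$

$$= \frac{r_{12.3}^2 + 2 r_{12.3}\, r_{23.1}\, r_{31.2} + r_{13.2}^2\, r_{23.1}^2}{(1 - r_{23.1}^2)(1 - r_{13.2}^2)}$$

using the identity of Ex. 1 for the two middle terms, so that

$$r_{12} = \frac{r_{12.3} + r_{13.2}\, r_{23.1}}{\sqrt{(1 - r_{23.1}^2)(1 - r_{13.2}^2)}}$$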

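Ex. 4. Find the formula which may express a partial correlation
coefficient of order n-2 in terms of multiple correlation coefficients
of orders n-1 and n-2.

We have

$$1 - R_{1(23 \ldots n)}^2 = (1 - r_{12}^2)(1 - r_{13.2}^2) \cdots (1 - r_{1n.23 \ldots (n-1)}^2)$$

$$1 - R_{1(23 \ldots (n-1))}^2 = (1 - r_{12}^2)(1 - r_{13.2}^2) \cdots (1 - r_{1(n-1).23 \ldots (n-2)}^2)$$

Dividing we get

$$1 - r_{1n.23 \ldots (n-1)}^2 = \frac{1 - R_{1(23 \ldots n)}^2}{1 - R_{1(23 \ldots (n-1))}^2}$$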


Ex. 5. (a) Show that for given $r_{12}$ and $r_{13}$, $r_{23}$ must lie in the range

$r_{12}\, r_{13} \pm (1 - r_{12}^2 - r_{13}^2 + r_{12}^2\, r_{13}^2)^{1/2}$

[Poona '65, Delhi M.A. '54]

(b) If $r_{12} = k$, $r_{23} = -k$, show that $r_{13}$ will lie between -1 and
$1 - 2k^2$. [Lucknow '64]

(a) $r_{12.3}^2 \le 1 \Rightarrow (r_{12} - r_{13}\, r_{23})^2 \le (1 - r_{13}^2)(1 - r_{23}^2)$
$\Rightarrow r_{23}^2 - 2 r_{12}\, r_{13}\, r_{23} + r_{12}^2 + r_{13}^2 - 1 \le 0 \Rightarrow r_{23}$
must be in the range

$r_{12}\, r_{13} \pm (1 - r_{12}^2 - r_{13}^2 + r_{12}^2\, r_{13}^2)^{1/2}$

since if $x^2 - 2ax + b^2 \le 0$, then

$(x - a)^2 \le a^2 - b^2$ or $|x - a| \le \sqrt{a^2 - b^2}$
i.e. $a - \sqrt{a^2 - b^2} \le x \le a + \sqrt{a^2 - b^2}$

(b) We have

$r_{13}^2 + 2k^2\, r_{13} + (2k^2 - 1) \le 0$

$\Rightarrow (r_{13} + k^2)^2 \le (1 - k^2)^2$

$\Rightarrow |r_{13} + k^2| \le 1 - k^2$
$\Rightarrow k^2 - 1 \le r_{13} + k^2 \le 1 - k^2$

$\Rightarrow -1 \le r_{13} \le 1 - 2k^2$.

Ex. 6. If $x_1$, $x_2$ and $x_3$ are three variates measured from their
respective means as origin and if $e_1$ is the expected value of $x_1$ for
given values of $x_2$ and $x_3$ from the linear regression of $x_1$ on $x_2$ and $x_3$,
prove that

$\operatorname{cov}(x_1, e_1) = \operatorname{var}(e_1) = \operatorname{var}(x_1) - \operatorname{var}(x_1 - e_1)$

[I.S.I. Cal. '56]

We have $e_1 = b_{12.3}\, x_2 + b_{13.2}\, x_3 = x_1 - x_{1.23}$

$\operatorname{cov}(x_1, e_1) = E(x_1 e_1) = E[x_1 (x_1 - x_{1.23})]$

$= \sigma_1^2 - \sigma_{1.23}^2$

$\operatorname{var}(e_1) = E[x_1 - x_{1.23}]^2$

$= E[x_1^2 - 2 x_1 x_{1.23} + x_{1.23}^2]$

$= E(x_1^2) - 2E(x_{1.23}^2) + E(x_{1.23}^2)$

$= E(x_1^2) - E(x_{1.23}^2) = \sigma_1^2 - \sigma_{1.23}^2$

$\operatorname{var}(x_1) - \operatorname{var}(x_1 - e_1) = E(x_1^2) - E(x_1 - e_1)^2$

$= E(x_1^2) - E(x_{1.23}^2)$

$= \sigma_1^2 - \sigma_{1.23}^2$

Thus all the relations are satisfied.

Ex. 7. If $x_1 = y_1 + y_2$, $x_2 = y_2 + y_3$, $x_3 = y_3 + y_1$,
where $y_1$, $y_2$, $y_3$ are uncorrelated variables each of which has
zero mean and unit s.d., find the multiple correlation coefficient
between $x_1$ and the two variables $x_2$ and $x_3$. [Delhi M.A. '61]

We have by hypothesis

$E(y_1) = E(y_2) = E(y_3) = 0$
$E(y_1^2) = 1$, $E(y_1 y_2) = 0$ etc.

$$r_{12} = \frac{E(x_1 x_2)}{\sqrt{E(x_1^2)\, E(x_2^2)}}$$

where $E(x_1 x_2) = E\{(y_1 + y_2)(y_2 + y_3)\}$

$= E[y_1 y_2 + y_1 y_3 + y_2^2 + y_2 y_3]$

$= 1$

$E(x_1^2) = E\{(y_1 + y_2)^2\} = E[y_1^2 + 2 y_1 y_2 + y_2^2] = 2$

Thus $r_{12} = \tfrac{1}{2}$

Similarly, because of symmetry, $r_{23} = \tfrac{1}{2}$, $r_{31} = \tfrac{1}{2}$.

Hence

$$R_{1(23)}^2 = \frac{r_{12}^2 + r_{13}^2 - 2 r_{12}\, r_{23}\, r_{31}}{1 - r_{23}^2}$$

$= (\tfrac{1}{4} + \tfrac{1}{4} - \tfrac{1}{4}) / (1 - \tfrac{1}{4}) = \tfrac{1}{3} \Rightarrow R_{1(23)} = 1/\sqrt{3}$.
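The closing computation can be confirmed with a few lines of Python. This is only a sketch of the formula for $R_{1(23)}^2$ in terms of the total correlations; the function name `multiple_correlation` is illustrative.

```python
import math

def multiple_correlation(r12, r13, r23):
    """R_{1(23)} from the three total correlations."""
    r_sq = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)
    return math.sqrt(r_sq)

# In Ex. 7 every total correlation equals 1/2
R = multiple_correlation(0.5, 0.5, 0.5)
print(R, 1 / math.sqrt(3))
```

Both printed numbers agree, matching the value $1/\sqrt{3}$ obtained above.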

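Ex. 8. If the relation $a x_1 + b x_2 + c x_3 = 0$ holds for all sets of
values of $x_1$, $x_2$ and $x_3$, what must be the partial correlations ?

Also prove that

$$r_{12} = \frac{c^2 \sigma_3^2 - a^2 \sigma_1^2 - b^2 \sigma_2^2}{2ab\,\sigma_1 \sigma_2}$$

[Cal. M.A.]

We have $x_1 = -\dfrac{b}{a}\, x_2 - \dfrac{c}{a}\, x_3$, $x_2 = -\dfrac{a}{b}\, x_1 - \dfrac{c}{b}\, x_3$

so that $b_{12.3} = -\dfrac{b}{a}$, $b_{21.3} = -\dfrac{a}{b}$

Hence $r_{12.3}^2 = b_{12.3}\, b_{21.3} = 1$

Since $b_{21.3} = r_{12.3}\,\dfrac{\sigma_{2.3}}{\sigma_{1.3}}$, the sign of $r_{12.3}$ is the same as that of
$b_{21.3} = -a/b$.

Thus $r_{12.3} = -1$ when a and b have the same sign (and +1 otherwise).
Similarly $r_{13.2}$ and $r_{32.1}$ are numerically equal to unity; when a, b and c
all have the same sign, $r_{13.2} = r_{32.1} = -1$.

Also $a x_1 + b x_2 = -c x_3 \Rightarrow a^2 x_1^2 + b^2 x_2^2 + 2ab\, x_1 x_2 = c^2 x_3^2$
$\Rightarrow a^2 \sigma_1^2 + b^2 \sigma_2^2 + 2ab\, r_{12}\,\sigma_1 \sigma_2 = c^2 \sigma_3^2$, since $\sum x_1 = \sum x_2 = \sum x_3 = 0$

$$\Rightarrow r_{12} = \frac{c^2 \sigma_3^2 - a^2 \sigma_1^2 - b^2 \sigma_2^2}{2ab\,\sigma_1 \sigma_2}$$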

Ex. 9. Show that if $x_3 = a x_1 + b x_2$, the three partial correlations
are numerically equal to unity, $r_{13.2}$ having the sign of a, $r_{23.1}$ the
sign of b and $r_{12.3}$ the opposite sign of a/b.

$x_3 = a x_1 + b x_2 \Rightarrow b_{31.2} = a$, $b_{32.1} = b$

$x_1 = \dfrac{1}{a}\, x_3 - \dfrac{b}{a}\, x_2 \Rightarrow b_{13.2} = \dfrac{1}{a}$, $b_{12.3} = -\dfrac{b}{a}$

$x_2 = \dfrac{1}{b}\, x_3 - \dfrac{a}{b}\, x_1 \Rightarrow b_{23.1} = \dfrac{1}{b}$, $b_{21.3} = -\dfrac{a}{b}$

Now $b_{12.3}\, b_{21.3} = r_{12.3}^2 = 1$. Similarly $r_{13.2}^2 = 1$, $r_{23.1}^2 = 1$

$a = b_{31.2} = r_{13.2}\,\dfrac{\sigma_{3.2}}{\sigma_{1.2}} \Rightarrow r_{13.2}$ has the same sign as a

$b = b_{32.1} = r_{23.1}\,\dfrac{\sigma_{3.1}}{\sigma_{2.1}} \Rightarrow r_{23.1}$ has the same sign as b

$-\dfrac{b}{a} = b_{12.3} = r_{12.3}\,\dfrac{\sigma_{1.3}}{\sigma_{2.3}} \Rightarrow r_{12.3}$ has the sign opposite to that of b/a,
i.e. of a/b.

Ex. 10. Show that the correlation coefficient between the residuals $x_{1.23}$ and $x_{2.13}$ is equal and opposite to that between $x_{1.3}$ and $x_{2.3}$.

[M.A. Delhi ’56; I.C.A.R. ’51]

Coefficient of correlation between $x_{1.23}$ and $x_{2.13}$

$= \dfrac{\sum x_{1.23}\,x_{2.13}}{N\sigma_{1.23}\sigma_{2.13}}$

$= \dfrac{\sum x_{2.13}(x_1 - b_{12.3}x_2 - b_{13.2}x_3)}{N\sigma_{1.23}\sigma_{2.13}}$

$= \dfrac{1}{N}\,\dfrac{-b_{12.3}\sum x_2x_{2.13}}{\sigma_{1.23}\sigma_{2.13}}$, since $\sum x_1x_{2.13} = \sum x_3x_{2.13} = 0$,

$= -b_{12.3}\,\dfrac{\sigma^2_{2.13}}{\sigma_{1.23}\sigma_{2.13}} = -b_{12.3}\,\dfrac{\sigma_{2.13}}{\sigma_{1.23}}$

$= -b_{12.3}\,\dfrac{\sigma_{2.3}}{\sigma_{1.3}} = -\dfrac{\mathrm{Cov}(x_{1.3}, x_{2.3})}{\sigma_{1.3}\sigma_{2.3}}$

$= -$ {coeff. of correlation between $x_{1.3}$ and $x_{2.3}$}.

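Ex. 11. If $r_{23} = 1$, prove that

(i) $r_{12}^2 = r_{13}^2$

(ii) $\sigma^2_{1.23} = \sigma_1^2(1 - r_{12}^2)$

We have $1 - R^2_{2(31)} = (1 - r_{23}^2)(1 - r^2_{21.3})$, so that

$r_{23} = 1 \Rightarrow R_{2(31)} = 1$.

Hence $\sigma^2_{2.31} = \sigma_2^2(1 - R^2_{2(31)}) = 0$,

which requires that all the deviations $x_{2.31}$ are zero, so that $x_2$ is given accurately by the regression equation. Hence from the result

$r_{12}^2 + r_{13}^2 + r_{23}^2 - 2r_{12}r_{13}r_{23} = 1$,
we have $(r_{12} - r_{13})^2 = 0 \Rightarrow r_{12}^2 = r_{13}^2$.

Also $r_{13.2} = \dfrac{r_{13} - r_{12}r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}}$ is taken as $0$, since with $r_{23} = 1$ and $r_{12} = r_{13}$ the numerator vanishes (equivalently, $x_3$ is then a linear function of $x_2$ and adds nothing to the regression of $x_1$ on $x_2$).

Hence $\sigma^2_{1.23} = \sigma_1^2(1 - r_{12}^2)(1 - r^2_{13.2}) \Rightarrow \sigma^2_{1.23} = \sigma_1^2(1 - r_{12}^2)$.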

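Ex. 12. If $r_{23} = 0$, prove that

(i) $R^2_{1(23)} = r_{12}^2 + r_{13}^2$

(ii) $\sigma^2_{1.23} = \sigma_1^2(1 - r_{12}^2 - r_{13}^2)$

We have $R^2_{1(23)} = \dfrac{r_{12}^2 + r_{13}^2 - 2r_{12}r_{23}r_{31}}{1 - r_{23}^2}$.

$r_{23} = 0 \Rightarrow R^2_{1(23)} = r_{12}^2 + r_{13}^2$.

$\sigma^2_{1.23} = \sigma_1^2(1 - r_{12}^2)(1 - r^2_{13.2})$

$= \sigma_1^2(1 - R^2_{1(23)})$

$= \sigma_1^2(1 - r_{12}^2 - r_{13}^2)$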

Ex. 13. Prove the following relations:

(i) $\sum x_{1.23}\,e_{1.23} = 0$

(ii) $\sum x_1^2 = \sum x_{1.23}^2 + b_{12.3}\sum x_1x_2 + b_{13.2}\sum x_1x_3$

(iii) $R^2_{1(23)} = \dfrac{b_{12.3}\sum x_1x_2 + b_{13.2}\sum x_1x_3}{\sum x_1^2}$.

We have $e_{1.23} = b_{12.3}x_2 + b_{13.2}x_3$.

Hence $\sum x_{1.23}\,e_{1.23} = \sum x_{1.23}(b_{12.3}x_2 + b_{13.2}x_3) = 0$.

Also $e_{1.23} = x_1 - x_{1.23} \Rightarrow x_1 = e_{1.23} + x_{1.23}$

$\therefore \sum x_1^2 = \sum e_{1.23}^2 + \sum x_{1.23}^2$

Also $\sum x_1e_{1.23} = \sum e_{1.23}^2$

and $\sum x_1e_{1.23} = \sum x_1(b_{12.3}x_2 + b_{13.2}x_3)$

$= b_{12.3}\sum x_1x_2 + b_{13.2}\sum x_1x_3$.

Hence $\sum x_1^2 = \sum x_{1.23}^2 + b_{12.3}\sum x_1x_2 + b_{13.2}\sum x_1x_3$.

Finally $R^2_{1(23)} = \dfrac{\sum e_{1.23}^2}{\sum x_1^2} = \dfrac{\sum x_1e_{1.23}}{\sum x_1^2} = \dfrac{b_{12.3}\sum x_1x_2 + b_{13.2}\sum x_1x_3}{\sum x_1^2}$.




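Ex. 14. If the correlation coefficients of zero order in a set of $p$ variates were all equal to $\rho$, show that

(a) every partial correlation of the $s$th order is $\rho/(1 + s\rho)$; and
(b) the coefficient of multiple correlation $R$ of a variate with the other $(p - 1)$ variates is given by

$1 - R^2 = (1 - \rho)\,\dfrac{1 + (p-1)\rho}{1 + (p-2)\rho}$.

[Delhi M.A. ’57; Bombay ’58; Nagpur ’53; I.C.A.R. ’47]

Given $r_{ij} = \rho$; $i, j = 1, 2, \ldots, p$; $i \ne j$.

$r_{ij.h} = \dfrac{r_{ij} - r_{ih}r_{jh}}{\sqrt{(1 - r_{ih}^2)(1 - r_{jh}^2)}}$, $i \ne j \ne h$; $i, j, h = 1, 2, \ldots, p$

$= \dfrac{\rho - \rho\cdot\rho}{\sqrt{(1 - \rho^2)(1 - \rho^2)}} = \dfrac{\rho(1 - \rho)}{1 - \rho^2} = \dfrac{\rho}{1 + \rho}$.

Thus every partial correlation of order one is $\dfrac{\rho}{1 + \rho}$.

A partial correlation of order $m + 1$ is expressible in terms of those of order $m$ by an equation of the same form as

$r_{12.3} = \dfrac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$.

Thus with a set $(k)$ of secondary subscripts added to each coefficient we have the formula

$r_{ij.h(k)} = \dfrac{r_{ij.(k)} - r_{ih.(k)}\,r_{jh.(k)}}{\sqrt{(1 - r^2_{ih.(k)})(1 - r^2_{jh.(k)})}}$

and, for order two,

$r_{ij.h(k)} = \left[\dfrac{\rho}{1+\rho} - \dfrac{\rho^2}{(1+\rho)^2}\right] \Big/ \left[1 - \dfrac{\rho^2}{(1+\rho)^2}\right] = \dfrac{\rho}{1 + 2\rho}$.

Now assuming that every partial correlation coefficient of order $s$ is $\dfrac{\rho}{1 + s\rho}$, the partial correlation coefficient of order $s + 1$ would be

$\left[\dfrac{\rho}{1+s\rho} - \dfrac{\rho^2}{(1+s\rho)^2}\right] \Big/ \left[1 - \dfrac{\rho^2}{(1+s\rho)^2}\right] = \dfrac{\rho}{1 + (s+1)\rho}$.

Hence (a) follows by induction.

Now $1 - R^2 = \dfrac{\Delta}{\Delta_{11}}$,

where $\Delta = \begin{vmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & & \vdots \\ \rho & \rho & \cdots & 1 \end{vmatrix}$, a $p$th order determinant,

$= [1 + (p-1)\rho]\,[1 - \rho]^{p-1}$

and $\Delta_{11} = [1 + (p-2)\rho]\,[1 - \rho]^{p-2}$.

Hence $1 - R^2 = (1 - \rho)\,\dfrac{1 + (p-1)\rho}{1 + (p-2)\rho}$.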
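Both parts of this example lend themselves to a numerical check. The sketch below is an illustration under stated assumptions, not part of the original solution: the helper names (`partial_of_order`, `determinant`, `equicorrelation`) and the trial values $\rho = 1/3$, $p = 5$ are chosen here only for the check. It applies the recursion $r \to (r - r^2)/(1 - r^2)$ repeatedly and evaluates $\Delta/\Delta_{11}$ by exact Gaussian elimination.

```python
from fractions import Fraction


def partial_of_order(rho, s):
    """Apply the order-raising recursion s times, starting from rho."""
    r = rho
    for _ in range(s):
        r = (r - r * r) / (1 - r * r)
    return r


def determinant(rows):
    """Exact determinant of a square matrix of Fractions by elimination."""
    m = [list(row) for row in rows]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n):
                m[i][j] -= factor * m[col][j]
    return det


def equicorrelation(p, rho):
    """p x p matrix with unit diagonal and rho everywhere else."""
    return [[Fraction(1) if i == j else rho for j in range(p)] for i in range(p)]


rho, p = Fraction(1, 3), 5  # trial values, chosen only for illustration

# Part (a): order-s partial correlation equals rho / (1 + s * rho).
for s in range(4):
    assert partial_of_order(rho, s) == rho / (1 + s * rho)

# Part (b): 1 - R^2 = Delta / Delta_11.
one_minus_r2 = determinant(equicorrelation(p, rho)) / determinant(equicorrelation(p - 1, rho))
print(one_minus_r2)  # 7/9
```

With these trial values the closed form $(1-\rho)[1+(p-1)\rho]/[1+(p-2)\rho]$ also gives $7/9$.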


Exercises 


1. Prove that

$r_{12.3}\,\dfrac{\sigma_{1.3}}{\sigma_{2.3}} = \dfrac{\sigma_1}{\sigma_2}\left(\dfrac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2}\right)$.

2. Show that ai ' 23 ct ^3» == _4£.v?». 

r i2-3 d J2 

3. Is it possible to get the following from a set of experimental data?

(a) $r_{23} = 0.8$, $r_{31} = -0.5$, $r_{12} = 0.6$

(b) $r_{23} = 0.7$, $r_{31} = 0.4$, $r_{12} = 0.6$ [B.Sc. Agra ’61]

[Ans. No]

4. For a trivariate distribution, show that

$1 + 2r_{12}r_{13}r_{23} \ge r_{12}^2 + r_{13}^2 + r_{23}^2$.

[B.Sc. (Hons.) Poona ’60]

5. If $r_{12} = +0.80$, $r_{13} = -0.40$, $r_{23} = -0.56$, find the values of $r_{12.3}$, $r_{13.2}$ and $r_{23.1}$. [M.A. (Stat.) Delhi ’65]

[Ans. $r_{12.3} = 0.759$, $r_{13.2} = 0.097$, $r_{23.1} = -0.436$]

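6. Show that the regression equation of $x_1$ on $x_2$ and $x_3$ can be obtained in the form

$\dfrac{x_1}{\sigma_1}d_1 + \dfrac{x_2}{\sigma_2}d_2 + \dfrac{x_3}{\sigma_3}d_3 = 0$,

where $d_1$, $d_2$ and $d_3$ are the cofactors of the elements of the first row in the determinant

$\begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{12} & 1 & r_{23} \\ r_{13} & r_{23} & 1 \end{vmatrix}$,

the $r$'s being the correlation coefficients and the $\sigma$'s the s.d.'s.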

7. In a trivariate distribution,

$\sigma_1 = 2$, $\sigma_2 = \sigma_3 = 3$,

$r_{12} = 0.7$, $r_{23} = r_{31} = 0.5$.

Find (i) $r_{23.1}$; (ii) $R^2_{1(23)}$; (iii) $b_{12.3}$, $b_{13.2}$ and (iv) $\sigma_{1.23}$. [B.Sc. Bombay ’68]

[Ans. (i) 0.2425 (ii) 0.52 (iii) 0.4, 0.1333 (iv) 1.3064]

8. Show that the values; 

^ia = f» ^£3 = & an d r 3i = — £ 

are not consistent. [B. Sc. Mysore ’64] 

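9. Prove that

$R^2_{1(23)} = b_{12.3}\,r_{12}\,\dfrac{\sigma_2}{\sigma_1} + b_{13.2}\,r_{13}\,\dfrac{\sigma_3}{\sigma_1}$.

[B.Sc. Lucknow ’66]

10. If $r_{12} = r_{23} = r_{31} = \rho \ne 1$, then

$r_{12.3} = r_{23.1} = r_{31.2} = \dfrac{\rho}{1 + \rho}$

and

$R_{1(23)} = R_{2(13)} = R_{3(12)} = \dfrac{\rho\sqrt2}{\sqrt{1 + \rho}}$.

[B.Sc. Sardar Patel ’68]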

11. Explain the terms (1) partial correlation, (2) multiple correlation, and (3) total correlation in a trivariate distribution. Show that

(1) $R^2_{1.23} = \dfrac{r_{12}^2 + r_{13}^2 - 2r_{12}r_{23}r_{31}}{1 - r_{23}^2}$

(2) $\sigma^2_{1.23} = \sigma_1^2(1 - r_{12}^2)(1 - r^2_{13.2})$.

12. Explain ‘partial correlation’ in a trivariate distribution. Prove that

$r_{12.3} = \dfrac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$.

If $r_{12} = k$ and $r_{23} = -k$, show that $r_{13}$ will lie between $-1$ and $1 - 2k^2$.

13. Explain the concepts of partial correlation and multiple correlation for three variables $x_1$, $x_2$ and $x_3$. Obtain the expression for the partial correlation coefficient $r_{12.3}$ in terms of ordinary correlation coefficients.

Deduce the relation between $r_{12.3}$ and the corresponding partial regression coefficients.


14. Define the partial and multiple correlation coefficients 
for a p-variate distribution. 

If A *=*[A/y] is the variance-covariance matrix. Show that 

0 ) -, 

v (A n A a) 


( 2 ) 





Prove that $\rho_{1(23\ldots p)} = 0 \Rightarrow \rho_{12} = \rho_{13} = \cdots = \rho_{1p} = 0$. Explain the significance of this result.

15. If $x_1$, $x_2$ and $x_3$ represent three characteristics of an individual, define the partial and multiple correlations and explain what they are supposed to measure.

If $x_1$, $x_2$ and $x_3$ have the same correlation $\rho$ among themselves pairwise, and if $R_{1(23)}$ is the multiple correlation coefficient in the usual notation, show that

$R^2_{1(23)} = \dfrac{2\rho^2}{1 + \rho}$.

Show further that in such a case the partial correlation coefficient is $\dfrac{\rho}{1 + \rho}$.

Deduce that if $\rho > -\tfrac12$ then $R_{1(23)} > r_{12.3}$.


11

THEORY OF ATTRIBUTES

11.1. Attributes.

Co-education is being carried on in India with great success. Boys and girls take their classes together right from kindergarten up to the postgraduate classes. K. G. K. College, Moradabad is one of the institutions which impart co-education. In 1971-1972 about 2000 students were studying different courses there. Now we divide or classify the family of 2000 students into two classes such that one class contains the boy students whereas the other contains the girl students. Denote the quality (or characteristic) of a student being a ‘boy’ by A. Then A is termed an attribute, and the absence of A is denoted by the Greek letter α. We have now the following definition.

Assume that N is the total number of observations. Let the family of N observations be divided into several classes (or subclasses) according to the qualities (or characteristics) of the individuals or objects. Then these qualities are termed attributes.

11.2. Dichotomy.

Definition. Dichotomy is a process by virtue of which a family (or class) of individuals is divided into two sub-families (or subclasses) according to whether they do or do not possess a particular attribute.

Notations. The capitals A, B, C, ... are used to denote the several attributes. An object or individual possessing the attribute A is named simply A. The family (or class), all the members of which possess the attribute A, is named the family (or class) A. Greek letters α, β, γ, ... are used to denote the absence of the attributes A, B, C, ...

Illustrations. (i) Assume A denotes the attribute honesty. Then α represents dishonesty.

(ii) If B represents the attribute sight, then β represents blindness.

(iii) If C stands for deafness, γ stands for hearing.

Conclusion. In general, ‘α’ is equivalent to “not-A”, that is, an object or individual not possessing the attribute A; the family α is equivalent to the family none of the members of which possesses the attribute A.

11.3. Combination of Attributes.

Assume that A stands for beauty and B for honesty. Then AB denotes the combination of beauty and honesty. Note that the presence and absence of these attributes give rise to four subclasses, viz. AB, Aβ, αB, αβ, where

AB includes the beautiful and honest,

Aβ includes the beautiful and dishonest,

αB includes the ugly and honest, and
αβ includes the ugly and dishonest.

If we consider a third attribute, deafness, denoted by C, then ABC includes those who are beautiful, honest and deaf, whereas αBC includes those who are ugly, honest and deaf.

Definition. A class symbol is a letter or a combination of letters like A, AB, αB, αβ, by virtue of which we can specify the character of the members of the class.

The ‘class frequency’ or ‘frequency’ of a class is defined to be the number of observations assigned to the class. Class frequencies are generally denoted by enclosing the corresponding class symbols in brackets. For example, (A) stands for the number of A's, that is, the elements possessing the attribute A, and (ABγ) denotes the number of ABγ's, that is, objects possessing both A and B but not C.

11.4. Order of classes and class-frequencies.

Definition. A class specified by m attributes is called a class of the mth order, and its frequency a frequency of the mth order. For example, AB, AC, BC are classes of the second order, whereas (A), (αB), (ABC) and (ABCD) are respectively class frequencies of the first, second, third and fourth order.

The table of frequencies for the case of three attributes is as follows (note that N denotes the whole number of observations):

Order 0    N

Order 1    (A)     (B)     (C)
           (α)     (β)     (γ)

Order 2    (AB)    (AC)    (BC)
           (Aβ)    (Aγ)    (Bγ)
           (αB)    (αC)    (βC)
           (αβ)    (αγ)    (βγ)

Order 3    (ABC)   (αBC)
           (ABγ)   (αBγ)
           (AβC)   (αβC)
           (Aβγ)   (αβγ)

From the table it is evident that 27 is the total number of frequencies.

Proposition 1. Continued dichotomy according to n attributes gives rise to $3^n$ classes. [Utkal 1967]

Proof. To prove this fact, consider the number of classes of different orders.

Of order 0 there is one class, N.
Of order 1 there are 2n classes.

Reason: Classes of order 1 contain only one symbol, and each of the n attributes contributes two symbols, one of the type A and one of the type α.

Of order 2 there are $\binom{n}{2} \times 2^2$ classes.

Argument. Each class contains two symbols. Two attributes out of n can be selected in $\binom{n}{2}$ ways, and each pair generates $2^2$ different frequencies of the types (AB), (Aβ), (αB) and (αβ).

A similar argument will show that of order j there are $\binom{n}{j} \times 2^j$ classes.

Consequently, the total number of class frequencies is

$1 + n\cdot 2 + \binom{n}{2}2^2 + \cdots + \binom{n}{n}2^n$,

which, in view of the binomial expansion, is equal to

$(1 + 2)^n = 3^n$.

This completes the proof.
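The counting argument can be illustrated by listing the classes explicitly. In the sketch below, which is only an illustration, the function name `classes_of_order` is an assumption, and a lower-case letter stands in for the Greek letter of a negative attribute (a for α, b for β, c for γ).

```python
from itertools import combinations, product
from math import comb


def classes_of_order(attributes, order):
    """All class symbols formed from `order` distinct attributes, each taken
    either positively (capital) or negatively (lower-case stand-in)."""
    symbols = []
    for chosen in combinations(attributes, order):
        for signs in product((True, False), repeat=order):
            symbol = "".join(a if positive else a.lower()
                             for a, positive in zip(chosen, signs))
            symbols.append(symbol or "N")  # the single class of order 0 is N
    return symbols


attributes = "ABC"
n = len(attributes)
counts = [len(classes_of_order(attributes, j)) for j in range(n + 1)]
print(counts)       # [1, 6, 12, 8]
print(sum(counts))  # 27
```

For three attributes the counts by order are 1, 6, 12 and 8, that is $\binom{3}{j}2^j$ for $j = 0, 1, 2, 3$, and their total is $3^3 = 27$.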

Proposition 2. Any class-frequency can always be expressed in terms of class-frequencies of higher order.

Proof. Evidently,

N = (A) + (α) ...(1)

because the total number of observations is equal to the number of A's added to the number of α's.

The fact that the number of A's equals the number of A's which are B's added to the number of A's which are β's implies that

(A) = (AB) + (Aβ). ...(2)

Similarly,

(AB) = (ABC) + (ABγ), ...(3)

and so on.

This completes the proof.

Definition. The frequencies of the highest order are termed the ultimate class-frequencies.

Proposition 3. Every class-frequency can be expressed as the sum of certain of the ultimate class-frequencies.

Proof. Assume that A, B, C are three attributes. Then

(A) = (AB) + (Aβ)

    = (ABC) + (ABγ) + (AβC) + (Aβγ).

Proposition 4. $2^n$ is the number of ultimate class-frequencies.

Proof. There are n symbols in any class-frequency of the highest order. Each symbol, relating to a particular attribute, may be written in two ways: A or α, B or β, C or γ, etc. This implies that the total number of possible class symbols is

2 × 2 × 2 × ... (n times) = $2^n$.

The number of ultimate class-frequencies is then $2^n$.

Deduction. The $3^n$ frequencies may all be expressed in terms of the $2^n$ ultimate frequencies. The reader may convince himself of this by verifying it.
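The deduction can be tried out on a small computation. The sketch below is an illustration only: the function `class_frequency` and the dictionary layout are assumptions made here, and the eight counts are the ultimate frequencies quoted in Exercise 4 at the end of this chapter (boys with defects A, B, C). Any class-frequency is obtained by adding the ultimate frequencies compatible with its class symbol.

```python
ATTRIBUTES = "ABC"

# Ultimate class-frequencies keyed by (A present?, B present?, C present?).
ultimate = {
    (True, True, True): 149,       # (ABC)
    (False, True, True): 204,      # (αBC)
    (True, True, False): 738,      # (ABγ)
    (False, True, False): 1762,    # (αBγ)
    (True, False, True): 225,      # (AβC)
    (False, False, True): 171,     # (αβC)
    (True, False, False): 1196,    # (Aβγ)
    (False, False, False): 21842,  # (αβγ)
}


def class_frequency(symbol, ultimate):
    """Sum the ultimate frequencies that agree with `symbol`.

    `symbol` maps an attribute name to True (present) or False (absent);
    attributes it does not mention are left unrestricted."""
    total = 0
    for cell, count in ultimate.items():
        if all(cell[ATTRIBUTES.index(name)] == present
               for name, present in symbol.items()):
            total += count
    return total


print(class_frequency({}, ultimate))                      # N    = 26287
print(class_frequency({"A": True}, ultimate))             # (A)  = 2308
print(class_frequency({"A": True, "B": True}, ultimate))  # (AB) = 887
```

The three printed values agree with the answers given for that exercise.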

Caution. The ultimate frequencies are, however, not the only set which specifies the whole of the data.

11.5. Definition. Assume that we have a set of class-frequencies such that they are $2^n$ in number and are algebraically independent; equivalently, when they are written symbolically, none can be expressed in terms of some or all of the others. Then such a set of frequencies is called a fundamental set.

Positive Attributes. The attributes represented by capitals A, B, C, ... are named positive attributes, whereas their contraries, denoted by Greek letters α, β, γ, ..., are termed negative attributes. Note that a class is positive if its class-symbol includes only capital letters, and negative if its class-symbol consists of only Greek letters. Hence A, AB, ABC are positive classes whereas α, αβ, αβγ are negative ones.

Meaning of A.N. We define A.N as the operation of dichotomising N according to A. Then we write

A.N = (A),

which is a symbolic way of saying that we obtain the class-frequency (A) on dichotomising N according to A. Similarly we write

α.N = (α).

Then

A.N + α.N = (A) + (α) = N,

which yields

(A + α).N = N,

implying that

A + α = 1.

Conclusion. Replacement of α by 1 − A, or of A by 1 − α, is justified.

Proposition 5. (a) (αβ) = N − (A) − (B) + (AB).

(b) (αβγ) = N − (A) − (B) − (C) + (AB) + (AC) + (BC) − (ABC).

Proof. By the definition of (αβ), we have

(αβ) = αβ.N,

which, in view of α = 1 − A and β = 1 − B, becomes

(αβ) = (1 − A)(1 − B).N

     = (1 − A − B + AB).N
     = N − A.N − B.N + AB.N
     = N − (A) − (B) + (AB).

This proves (a).

Again, (αβγ) = αβγ.N

     = (1 − A)(1 − B)(1 − C).N
     = (1 − A − B − C + AB + AC + BC − ABC).N
     = N − (A) − (B) − (C) + (AB) + (AC) + (BC) − (ABC).

This proves (b).
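The symbolic expansion used in this proof carries over directly to a computation. The sketch below is an illustration, not the authors' procedure: the function name `frequency` and the convention of keying the positive class-frequencies by strings (with the empty string standing for N) are assumptions. The figures are those of illustrative Example 3 of this chapter. Each subset S of the absent attributes contributes the positive frequency of (present attributes together with S), with sign $(-1)^{|S|}$, exactly as in the product of factors (1 − A)(1 − B)....

```python
from itertools import combinations

# Positive class-frequencies; "" stands for N.
positive = {"": 10000, "A": 877, "B": 1086, "C": 286,
            "AB": 338, "AC": 143, "BC": 135, "ABC": 57}


def frequency(present, absent, positive):
    """Class-frequency with `present` attributes positive and `absent` negative,
    obtained by expanding the product of (1 - X) over the absent attributes X."""
    total = 0
    for size in range(len(absent) + 1):
        for subset in combinations(absent, size):
            key = "".join(sorted(present + "".join(subset)))
            total += (-1) ** size * positive[key]
    return total


print(frequency("", "AB", positive))   # (αβ)  = 8375
print(frequency("", "ABC", positive))  # (αβγ) = 8310
print(frequency("A", "BC", positive))  # (Aβγ) = 453
```

The first two lines reproduce parts (a) and (b) of the proposition for these data.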

Proposition 6. For n attributes A, B, C, ..., M,

(ABC...M) ≥ (A) + (B) + (C) + ... + (M) − (n − 1)N,

where N is the total frequency. (Agra 1972, 69)

Proof (by induction). As a matter of fact,

(αβ) = αβ.N = (1 − A)(1 − B).N

     = N − (A) − (B) + (AB).

Recall that no class-frequency can be negative. Hence

(αβ) ≥ 0 ⇒ N − (A) − (B) + (AB) ≥ 0

        ⇒ (AB) ≥ (A) + (B) − N. ...(1)

Replacement of B by BC in (1) gives

(ABC) ≥ (A) + (BC) − N,

which, on applying (1) to (BC), assumes the form

(ABC) ≥ (A) + (B) + (C) − 2N. ...(2)

Now we wish to demonstrate that the proposition must hold for m + 1 attributes if it holds for m attributes. Let us assume that the proposition is true for m attributes. Then

(ABC...R) ≥ (A) + (B) + (C) + ... + (R) − (m − 1)N. ...(3)

Writing RS for R in (3) furnishes

(ABC...RS) ≥ (A) + (B) + (C) + ... + (RS) − (m − 1)N,
which, by dint of (1), takes the form

(ABC...RS) ≥ (A) + (B) + (C) + ... + (R) + (S) − mN.
This shows that the proposition is true for m + 1 attributes. Relation (1) shows that the proposition is true for n = 2; hence it is true for n = 3 (which is relation (2)), and so on. This completes the proof.

Proposition 7. If A occurs in a larger proportion of the cases where B is than where B is not, then B will occur in a larger proportion of the cases where A is than where A is not, i.e.

(AB)/(B) > (Aβ)/(β) ⇒ (AB)/(A) > (αB)/(α).

(Agra 1968)

Proof. Our hypothesis reads that

(AB)/(B) > (Aβ)/(β),

which gives

(β)/(B) > (Aβ)/(AB),

so that

1 + (β)/(B) > 1 + (Aβ)/(AB) ⇒ [(β) + (B)]/(B) > [(Aβ) + (AB)]/(AB)

⇒ N/(B) > (A)/(AB) ⇒ N/(A) > (B)/(AB)

⇒ [(A) + (α)]/(A) > [(AB) + (αB)]/(AB)

⇒ 1 + (α)/(A) > 1 + (αB)/(AB)

⇒ (α)/(A) > (αB)/(AB)

⇒ (AB)/(A) > (αB)/(α), proving the proposition.

Illustrative Examples

1. (a) If (A) = (α) = (B) = (β) = ½N,
then (AB) = (αβ), (Aβ) = (αB).

(b) If (A) = (α) = (B) = (β) = (C) = (γ) = ½N

and (ABC) = (αβγ), then

2(ABC) = (AB) + (BC) + (AC) − ½N.

For, (a) recall that

(αβ) = N − (A) − (B) + (AB),

which, in view of our hypothesis that (A) = (B) = ½N, yields

(αβ) = N − ½N − ½N + (AB),

which reduces to

(αβ) = (AB).

As a matter of fact,

(αB) = (α) − (αβ)

     = (α) − (β) + (Aβ),

which, since (α) = (β), reduces to

(αB) = (Aβ).

(b) Recall that

(αβγ) = N − (A) − (B) − (C) + (AB) + (BC) + (AC) − (ABC).

By hypothesis,

(ABC) = (αβγ), (A) = (B) = (C) = ½N.

Then we have

(ABC) = N − ½N − ½N − ½N + (AB) + (BC) + (AC) − (ABC),

which simplifies to

2(ABC) = (AB) + (BC) + (AC) − ½N.

2. In a free vote in the House of Commons, 600 members voted. 300 Government members representing English constituencies (including Welsh) voted in favour of the motion. 25 Opposition members representing Scottish constituencies voted against the motion. The Government majority among those who voted was 96. 135 of the members voting represented Scottish constituencies. 18 Government members voted against the motion. 102 Scottish members voted in favour of the motion. The motion was carried by 310 votes. Analyse the voting according to the nationality of the constituencies and party.

Denote Government membership, voting for the motion and English constituency by A, B, C respectively. Then the problem supplies us with the following information:

N = 600, (ABC) = 300, (αβγ) = 25, (A) − (α) = 96, (γ) = 135,
(Aβ) = 18, (Bγ) = 102, (B) − (β) = 310.

To compute (A) and (α), we have

(A) − (α) = 96 and (A) + (α) = 600,

and then we find (A) = 348, (α) = 252.

Similarly (B) − (β) = 310 with (B) + (β) = 600
yields (B) = 455, (β) = 145.

Also (γ) = 135 gives (C) = 600 − 135 = 465.
Now (AB) = (A) − (Aβ) = 330, (BC) = (B) − (Bγ) = 353.
We see that (αβγ) = 25 gives (AC) = 310 in view of the formula

(αβγ) = N − (A) − (B) − (C) + (AB) + (BC) + (AC) − (ABC).

Now (ABγ) = (AB) − (ABC) = 30,

(αBC) = (BC) − (ABC) = 53,

(AβC) = (AC) − (ABC) = 10,

(βγ) = (γ) − (Bγ) = 33,

(αβ) = (β) − (Aβ) = 127,

(αβC) = (αβ) − (αβγ) = 102,

(αBγ) = (Bγ) − (ABγ) = 72,

(Aβγ) = (βγ) − (αβγ) = 8.

Thus the ultimate class frequencies are as follows:

(ABC) = 300, (αBC) = 53, (AβC) = 10,

(ABγ) = 30, (αβC) = 102, (αBγ) = 72,

(Aβγ) = 8, (αβγ) = 25.

3. Compute all the class-frequencies from the following data:
N = 10,000, (A) = 877, (B) = 1086, (C) = 286,

(AB) = 338, (AC) = 143, (BC) = 135,

(ABC) = 57.
As a matter of fact, (AB) = (ABC) + (ABγ).

Then 338 = 57 + (ABγ),
which yields (ABγ) = 281.

Similarly, (AC) and (BC) will give

(AβC) = 86,

(αBC) = 78.

Now (αβC) = (βC) − (AβC)

 = (C) − (BC) − (AβC)

 = 286 − 135 − 86
 = 65.

Similarly, we find
(Aβγ) = 453,

(αBγ) = 670.

Finally, (αβγ) = N − (A) − (B) − (C) + (AB) + (AC) + (BC) − (ABC) = 8310.
Also (αγ) = N − (A) − (C) + (AC)

 = 10000 − 877 − 286 + 143

 = 8980;

(αβ) = N − (A) − (B) + (AB)

 = 8375;

(βγ) = N − (B) − (C) + (BC) = 8763.

4. In a very hotly fought battle, at least 70 per cent of the combatants lost an eye, at least 75 per cent an ear, at least 80 per cent an arm and at least 85 per cent a leg. How many at least must have lost all four?

Denote the attributes of losing an eye, an ear, an arm and a leg respectively by A, B, C and D. Let N = 100. Then the problem furnishes the following information:

(A) ≥ 70, (B) ≥ 75,
(C) ≥ 80, (D) ≥ 85.

The number of combatants who lost all the four is (ABCD). Hence, in view of Prop. 6,

(ABCD) ≥ (A) + (B) + (C) + (D) − 3N
 ≥ 70 + 75 + 80 + 85 − 300
 = 10.

This implies that at least 10 per cent of the combatants have lost all the four.
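The bound of Proposition 6, as applied above with (A) ≥ 70, (B) ≥ 75, (C) ≥ 80, (D) ≥ 85 and N = 100, reduces to a one-line computation. The sketch below is illustrative; the function name `lower_bound_all` is an assumption, and the result is floored at zero because no class-frequency can be negative.

```python
def lower_bound_all(frequencies, total):
    """Least possible value of (ABC...M) given the first-order frequencies:
    (A) + (B) + ... + (M) - (n - 1) N, but never below zero."""
    n = len(frequencies)
    return max(0, sum(frequencies) - (n - 1) * total)


print(lower_bound_all([70, 75, 80, 85], 100))  # 10
```

With only two attributes each present in half the population, the same function returns 0, that is, the bound then says nothing beyond non-negativity.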

5. 100 children took three examinations. 40 passed the first, 39 passed the second and 48 passed the third. 10 passed all three, 21 failed all three, 9 passed the first two and failed the third, 19 failed the first two and passed the third. Find how many children passed at least two examinations. Show that for the question asked certain of the given frequencies are not necessary. Which are they?

Denote the passing of the first, second and third examinations respectively by A, B, C. Then N = 100, (A) = 40, (B) = 39, (C) = 48, (ABC) = 10, (αβγ) = 21, (ABγ) = 9, (αβC) = 19. To compute the number of children who have passed at least two examinations, we have to calculate

(ABγ) + (AβC) + (αBC) + (ABC),

which equals

(ABγ) + (AC) − (ABC) + (BC) − (ABC) + (ABC),

i.e. (ABγ) + (C) − (αβC),
which is given by 9 + 48 − 19 = 38.

(C), (αβC) and (ABγ) are all that is needed to answer the question. The other five frequencies (including N) are redundant.

11.6. Consistence. Class-frequencies observed within one and the same population are consistent with one another: they conform with one another and in no case conflict.

Assume that (A) = 50 and (AB) = 60. These figures are inconsistent, because (AB) > (A) is impossible when both are observed in the same population.

Condition for Consistence. The necessary and sufficient condition for the consistence of a set of independent class-frequencies is that no ultimate class-frequency be negative.

The reader should note that no class-frequency is negative if it arises from counting real attributes. This proves the necessity of the condition. Conversely, assume that we have any set of $2^n$ non-negative numbers. Then we can always imagine a real population with n dichotomies having these numbers as its ultimate class-frequencies, and it is impossible for a real population to produce inconsistent results. This proves the sufficiency of the condition.

Remark. ‘No ultimate class-frequency is negative’ implies that the given data are consistent; equivalently, the data are inconsistent if any of the ultimate class-frequencies is negative.

Conditions for consistence of data.

(a) Assume that there exists only one attribute A. Then the required conditions are

(A) ≥ 0 and (A) ≤ N.

(b) Assume that there are only two attributes, say, A and B. Then the conditions for testing consistence are

(1) (AB) ≥ 0

(2) (AB) ≥ (A) + (B) − N

(3) (AB) ≤ (A)

(4) (AB) ≤ (B)

(c) Suppose we have three attributes, say, A, B, C. Then the conditions that the eight ultimate frequencies are not negative are as follows:

(1) (ABC) ≥ 0

(2) (ABC) ≥ (AB) + (AC) − (A)

(3) (ABC) ≥ (AB) + (BC) − (B)

(4) (ABC) ≥ (AC) + (BC) − (C)

(5) (ABC) ≤ (AB)

(6) (ABC) ≤ (BC)

(7) (ABC) ≤ (AC)

(8) (ABC) ≤ (AB) + (AC) + (BC) − (A) − (B) − (C) + N

Proof of (ABC) ≥ (AB) + (AC) − (A).

In fact, (ABγ) ≤ (Aγ).

Then (AB) − (ABC) ≤ (A) − (AC),
from which we have

(ABC) ≥ (AB) + (AC) − (A).

The eight conditions of (c) give rise to the following 4 more conditions:

(9) (AB) + (AC) + (BC) ≥ (A) + (B) + (C) − N

(10) (AB) + (AC) − (BC) ≤ (A)

(11) (AB) − (AC) + (BC) ≤ (B)

(12) −(AB) + (AC) + (BC) ≤ (C)

The inequality (9) is immediate from (1) and (8) of (c). (6) with (2) yields (10). The reader will verify (11) and (12) himself.
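The conditions for three attributes amount to requiring that each of the eight ultimate class-frequencies be non-negative, which suggests a direct test. The sketch below is only an illustration: the function names `ultimate_frequencies` and `is_consistent` and the string-keyed dictionaries (with "" standing for N) are assumptions. The first data set is that of the market investigator discussed later in this section; the second is that of the earlier example with N = 10,000.

```python
from itertools import combinations, product


def ultimate_frequencies(positive, attributes="ABC"):
    """Return {cell: count}, where cell is a tuple of booleans
    (True = attribute present) in the order of `attributes`."""
    result = {}
    for cell in product((True, False), repeat=len(attributes)):
        present = [a for a, flag in zip(attributes, cell) if flag]
        absent = [a for a, flag in zip(attributes, cell) if not flag]
        total = 0
        # Expand the product of (1 - X) over the absent attributes X.
        for size in range(len(absent) + 1):
            for subset in combinations(absent, size):
                key = "".join(sorted(present + list(subset)))
                total += (-1) ** size * positive[key]
        result[cell] = total
    return result


def is_consistent(positive, attributes="ABC"):
    """Data are consistent when no ultimate class-frequency is negative."""
    return all(count >= 0
               for count in ultimate_frequencies(positive, attributes).values())


market = {"": 1000, "A": 811, "B": 752, "C": 418,
          "AB": 570, "AC": 356, "BC": 348, "ABC": 297}
defects = {"": 10000, "A": 877, "B": 1086, "C": 286,
           "AB": 338, "AC": 143, "BC": 135, "ABC": 57}

print(ultimate_frequencies(market)[(False, False, False)])  # (αβγ) = -4
print(is_consistent(market))   # False
print(is_consistent(defects))  # True
```

Checking the eight ultimate frequencies in this way is equivalent to checking conditions (1) to (8) of (c).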


Illustrative Examples.

1. (a) If all A's are B's and all B's are C's, then all A's are C's.

(b) If all A's are B's and no B's are C's, then no A's are C's.

For, (a) the hypothesis reads that (AB) = (A) and (BC) = (B). Then

(AB) + (BC) − (AC) ≤ (B) ⇒

(A) + (B) − (AC) ≤ (B) ⇒ (A) ≤ (AC).

But (A) < (AC) is impossible. Hence (AC) = (A), i.e. all A's are C's.

(b) By hypothesis (AB) = (A), (BC) = 0.

Then (AB) − (BC) + (AC) ≤ (A) ⇒

(A) + (AC) ≤ (A) ⇒ (AC) ≤ 0 ⇒

(AC) = 0, since no class-frequency is negative ⇒ no A's are C's.

2. Given that (A) = (B) = (C) = ½N and 80% of the A's are B's, 75% of the A's are C's, find the limits to the percentage of B's that are C's.

The problem furnishes the following data:

(AB)/(A) = 0.8, (AC)/(A) = 0.75,

which, in view of (A) = ½N, become

2(AB)/N = 0.8, 2(AC)/N = 0.75.

Recall that

(AB) + (AC) + (BC) ≥ (A) + (B) + (C) − N
(AB) + (AC) − (BC) ≤ (A)

(AB) − (AC) + (BC) ≤ (B)

−(AB) + (AC) + (BC) ≤ (C).

Then, on dividing by ½N, we have

(a) 2(BC)/N ≥ 1 − 0.8 − 0.75

(b) 2(BC)/N ≥ 0.8 + 0.75 − 1

(c) 2(BC)/N ≤ 1 − 0.8 + 0.75

(d) 2(BC)/N ≤ 1 + 0.8 − 0.75.

An examination shows that (a) provides a negative limit and (d) a limit greater than unity; they are therefore of no use. (b) and (c) give

0.55 ≤ 2(BC)/N ≤ 0.95, where 2(BC)/N = (BC)/(B).

In conclusion, not less than 55 per cent nor more than 95 per cent of the B's can be C's.
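The limits in examples of this type follow mechanically from conditions (9) to (12) together with 0 ≤ (BC) ≤ (B) and (BC) ≤ (C). The sketch below is an illustration under assumptions: the function name `bc_limits` is introduced here, and since the example specifies only proportions, the total N = 200 (so that (A) = (B) = (C) = 100, (AB) = 80, (AC) = 75) is an arbitrary choice made for the computation.

```python
def bc_limits(n, a, b, c, ab, ac):
    """Limits on (BC) implied by conditions (9)-(12) and by
    0 <= (BC) <= (B), (BC) <= (C)."""
    lower = max(0,
                a + b + c - n - ab - ac,  # from (9)
                ab + ac - a)              # from (10)
    upper = min(b, c,
                b - ab + ac,              # from (11)
                c + ab - ac)              # from (12)
    return lower, upper


# N = 200 is assumed; only the ratios matter for the percentages.
lower, upper = bc_limits(200, 100, 100, 100, 80, 75)
print(lower, upper)                            # 55 95
print(100 * lower / 100, 100 * upper / 100)    # percentages of the B's: 55.0 95.0
```

The same function can be applied to any example in which N, (A), (B), (C), (AB) and (AC) are known and limits for (BC) are required.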

3. If a report gives the following frequencies as actually observed, show that there must be a misprint or mistake of some sort, and that possibly the misprint consists in the dropping of 1 before the 85 given as the frequency (BC):

N = 1000

(A) 510    (AB) 189

(B) 490    (AC) 140

(C) 427    (BC) 85

[Agra 1958]

Recall that

(BC) + (AB) + (AC) ≥ (A) + (B) + (C) − N.
Then (BC) ≥ (510 + 490 + 427) − 1000 − 189 − 140
⇒ (BC) ≥ 98. Since 85 < 98, 85 cannot be the correct value of (BC). We observe that all the conditions are met if we read 185 for 85.

4. Among the adult population of a certain town 50 per cent of the population are male, 60 per cent are wage-earners and 50 per cent are 45 years of age or over. 10 per cent of the males are not wage-earners and 40 per cent of the males are under 45. Can we infer anything about what percentage of the population of 45 or over are wage-earners?

Denote the attributes male, wage-earner and 45 years old or more by A, B, C. Then we have the following data:

N = 100, (A) = 50, (B) = 60, (C) = 50, (Aβ) = 5, (Aγ) = 20.

We wish to compute the limits, if any, of (BC).

Now (AB) = (A) − (Aβ) = 45

and (AC) = (A) − (Aγ) = 30.

The conditions of consistence are

(a) (AB) + (BC) + (AC) ≥ (A) + (B) + (C) − N

(b) (AB) + (AC) − (BC) ≤ (A)

(c) (AB) − (AC) + (BC) ≤ (B)

(d) −(AB) + (AC) + (BC) ≤ (C).

They yield (BC) ≥ −15, (BC) ≥ 25, (BC) ≤ 45 and (BC) ≤ 65, so that

25 ≤ (BC) ≤ 45

(noting that (BC) ≤ 65 is in any case superseded, because (B) = 60 and (C) = 50 require (BC) ≤ (C)).

In conclusion, the percentage of the population 45 years old or more (50 per cent of the total population) who are wage-earners lies between 50 and 90 per cent.

5. A market investigator returns the following data. Of 1000 people consulted, 811 liked chocolates, 752 liked toffee and 418 liked boiled sweets; 570 liked chocolates and toffee, 356 liked chocolates and boiled sweets, and 348 liked toffee and boiled sweets; 297 liked all three. Show that this information as it stands must be incorrect. [Agra 1956]

Denote liking chocolates, toffee and boiled sweets by A, B, C. Then we have the following data to hand:

N = 1000, (A) = 811, (B) = 752, (C) = 418,

(AB) = 570, (AC) = 356, (BC) = 348,

(ABC) = 297. Then

(αβγ) = N − (A) − (B) − (C) + (AB) + (BC) + (AC) − (ABC)

 = 1000 − 811 − 752 − 418 + 570 + 348 + 356 − 297
 = 2274 − 2278
 = −4, a negative quantity.

This shows that the information is incorrect, because the necessary and sufficient condition for the consistence of a set of independent class-frequencies regarding a particular universe is that no ultimate class-frequency computed from them is negative.

6. A penny is tossed three times and the results, heads and tails, noted. The process is continued until there are 100 sets of threes. In 69 cases heads fell first, in 49 cases heads fell second, and in 53 cases heads fell third. In 33 cases heads fell both first and second, and in 21 cases heads fell both second and third. Show that there must have been at least 5 occasions on which heads fell three times, and that there could not have been more than 15 occasions on which tails fell three times, though there need not have been any.

Denote the attributes of getting a head in the first, second and third trial by A, B, C respectively. Then we have the following data:

N = 100, (A) = 69, (B) = 49, (C) = 53,

(AB) = 33, (BC) = 21.

Now we wish to show that (ABC) ≥ 5 and (αβγ) ≤ 15.

Recalling the conditions for consistence of data, we have
(ABC) ≥ (BC) + (AB) − (B) = 21 + 33 − 49 = 5

and (αβγ) ≤ (αβ) = N − (A) − (B) + (AB)

 = 100 − 69 − 49 + 33 = 15.
Hence (ABC) ≥ 5 and (αβγ) ≤ 15.


7. If (A)/N = x, (B)/N = 2x, (C)/N = 3x and (AB)/N = (AC)/N = (BC)/N = y, then the value of neither x nor y can exceed ¼.

[Agra 1969]

Proof. Recall the conditions of consistence:

(a) (BC) ≥ (B) + (C) − N

(b) (AC) ≥ (A) + (C) − N

(c) (AB) ≥ (A) + (B) − N

(d) (BC) ≤ (B)

(e) (AC) ≤ (C)

(f) (AB) ≤ (A).

Dividing the above relations by N and substituting the given values, we find from (a) to (c) that

y ≥ 5x − 1, y ≥ 4x − 1, y ≥ 3x − 1,
and from (d) to (f) that

y ≤ 2x, y ≤ 3x, y ≤ x.

The most restrictive of these are y ≥ 5x − 1 and y ≤ x; when they hold, all the conditions listed from (a) to (f) are satisfied. Together they give 5x − 1 ≤ y ≤ x, so that 5x − 1 ≤ x. This implies that 4x ≤ 1 ⇒ x ≤ ¼. The relation y ≤ x then gives y ≤ ¼. Hence neither x nor y can exceed ¼. This proves our assertion.

8. Given that (A) = (B) = (C) = ½N and that (AB)/N = (AC)/N = p, find what must be the greatest and least values of p in order that we may infer that (BC)/N exceeds any given value, say q.

Recall the fact that

(AB) + (BC) + (AC) ≥ (A) + (B) + (C) − N,

which, in view of our hypothesis that

(A) = (B) = (C) = ½N

and (AB)/N = (AC)/N = p,

yields

Np + (BC) + Np ≥ ½N + ½N + ½N − N,

that is,

(BC) ≥ ½N − 2Np.

Dividing this inequality by N, we obtain

(BC)/N ≥ ½ − 2p. ...(1)

Recalling that (AB) + (AC) − (BC) ≤ (A), we have, in view of (AB)/N = (AC)/N = p and (A) = ½N,

2pN − (BC) ≤ ½N,

which yields

(BC)/N ≥ 2p − ½. ...(2)

The inequality (AB) ≤ (A), in view of

(AB)/N = p, (A) = ½N,

furnishes

p ≤ ½. ...(3)

We wish to infer that (BC)/N exceeds q. This follows from (1) if

½ − 2p ≥ q,

from which we obtain

p ≤ ¼(1 − 2q). ...(4)

It follows from (2) if 2p − ½ ≥ q, which implies

p ≥ ¼(1 + 2q). ...(5)

The fact that p cannot be negative means that the lower limit of p is zero, and hence p ≥ 0. ...(6)

By dint of (4) and (6), we conclude that

0 ≤ p ≤ ¼(1 − 2q),
and by virtue of (3) and (5), we find

¼(1 + 2q) ≤ p ≤ ½.

This completes the solution of the problem.

Exercises 

1. In a certain set of 1000 observations (A) = 45, (B) = 23, (C) = 14. Show that whatever the percentage of B's that are A's and of C's that are A's, it cannot be inferred that any B's are C's.

[Aid. Apply the conditions

(AB) + (AC) + (BC) ≥ (A) + (B) + (C) − N
and (AB) + (AC) − (BC) ≤ (A). Then find

(BC)/N ≥ −(AB)/N − (AC)/N − (1000 − 45 − 23 − 14)/1000

or (BC)/N ≥ (AB)/N + (AC)/N − 0.045.

The first limit is evidently negative. The second one must also be negative because (AB)/N cannot exceed 0.023 and (AC)/N cannot exceed 0.014. Conclusion: there exists no limit to (BC) greater than 0, and hence it is impossible to infer that any B's are C's.]

2. Consider the following data:

N     1000     (AB)    42
(A)   525      (AC)    147
(B)   312      (BC)    86
(C)   470      (ABC)   25

Show that the data are inconsistent.

[Aid. Apply the condition (αβγ) ≥ 0. By computation, (αβγ) = −57.]

3. Find all the class-frequencies from the following data:

N = 10000, (A) = 877, (B) = 1086, (C) = 286, (AB) = 338, (AC) = 143, (BC) = 135, (ABC) = 57.

[Ans. (αγ) = 8980, (αBγ) = 670, (αβγ) = 8310, (αβC) = 65, (Aβγ) = 453, (AβC) = 86, (αBC) = 78, (ABγ) = 281]

4. The following are the numbers of boys observed with certain classes of defects amongst a number of school children. A denotes development defects; B, nerve signs; C, low nutrition.

(ABC)   149      (αBC)   204
(ABγ)   738      (αBγ)   1762
(AβC)   225      (αβC)   171
(Aβγ)   1196     (αβγ)   21842

Find the frequencies of the positive classes.

[Ans. N     26287    (AB)    887
      (A)   2308     (AC)    374
      (B)   2853     (BC)    353
      (C)   749      (ABC)   149]

5. The following are the frequencies of the positive classes of
the girls in the same investigation:

N      23713        (AB)     587
(A)     1618        (AC)     428
(B)     2015        (BC)     335
(C)      770        (ABC)    156

Find the frequencies of the ultimate classes.

[Ans.  $(ABC)$            156        $(\alpha BC)$              179
       $(AB\gamma)$       431        $(\alpha B\gamma)$        1249
       $(A\beta C)$       272        $(\alpha\beta C)$          163
       $(A\beta\gamma)$   759        $(\alpha\beta\gamma)$    20504]
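
The passage from positive classes to ultimate classes in Exercises 3 to 5 is mechanical, so it may help to see it written out once. The sketch below is in Python; the function name `ultimate_classes` and the dictionary keys are illustrative assumptions. In the keys an upper-case letter marks an attribute as present, while the lower-case letters a, b, g stand for $\alpha$, $\beta$, $\gamma$.

```python
def ultimate_classes(n, a, b, c, ab, ac, bc, abc):
    """Eight ultimate class frequencies from the positive class frequencies."""
    return {
        "ABC": abc,
        "aBC": bc - abc,
        "ABg": ab - abc,
        "aBg": b - ab - bc + abc,
        "AbC": ac - abc,
        "abC": c - ac - bc + abc,
        "Abg": a - ab - ac + abc,
        "abg": n - a - b - c + ab + ac + bc - abc,
    }


# Data for the girls (Exercise 5).
girls = ultimate_classes(23713, 1618, 2015, 770, 587, 428, 335, 156)
for name, freq in girls.items():
    print(name, freq)

# The ultimate classes must add up to N.
print(sum(girls.values()))
```

Summing the eight results and comparing with N is a cheap check on the arithmetic, and any negative entry would signal inconsistent data.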


6. Measurements are made on a thousand husbands and a
thousand wives. If the measurements of the husbands exceed the
measurements of the wives in 800 cases for one measurement, in 700
cases for another, and in 660 cases for both measurements, in how
many cases will both measurements on the wife exceed the measurements
on the husband?

[Ans. Assume A = husband exceeding wife in first measurement,
B = husband exceeding wife in second measurement,
N = 1000, (A) = 800, (B) = 700, (AB) = 660. By computation,
$(\alpha\beta) = 160$, which is the number we wish to find.]

7. In a war between White and Red forces there are more Red
soldiers than White; there are more armed Whites than unarmed
Reds; there are fewer armed Reds with ammunition than unarmed
Whites without ammunition. Show that there are more armed
Reds without ammunition than unarmed Whites with ammunition.

[Aid. Denote the attribute of being White by A, armed by B
and possessed of ammunition by C. Then $(\alpha) > (A)$, $(AB) > (\alpha\beta)$.
Show $(\alpha B\gamma) > (A\beta C)$.]

8. The following are the proportions per 10000 of boys observed
for certain classes of defects amongst a number of school children.
A = development defects, B = nerve signs, D = mental dullness.

N   = 10000        (D)  = 789
(A) = 877          (AB) = 338
(B) = 1086         (BD) = 455

Show that some dull boys do not exhibit development defects,
and state how many at least do not do so.

[Ans. 117]

9. The following are the corresponding figures for girls:

N   = 10000        (D)  = 689
(A) = 682          (AB) = 248
(B) = 850          (BD) = 363

Show that some defectively developed girls are not dull, and
state how many at least must be so.

[Ans. 108]

10. 50 percent of the imports of barley into a country come
from the Dominions; 80 percent of the imports go to brewing; 75 percent
of the imports are grown in the Northern Hemisphere; 80 percent of
Northern-grown barley goes to brewing; 100 percent of foreign
Southern-grown barley goes to stock-feeding. Show that the foreign
Northern-grown barley which goes to brewing cannot be less than 30
percent nor more than 50 percent of the total imports.

[Aid. A = barley coming from the Dominions,
B = barley used in brewing,
C = barley grown in the Northern Hemisphere. N = 100,
(A) = 50, (B) = 80, (C) = 75, (BC) = $\tfrac{80}{100} \times 75$ = 60, $(\alpha\beta\gamma) = (\alpha\gamma)$.

Show $(\alpha BC) \ge 30$ and $\le 50$.]

11. A social survey in a village revealed that there were more
uneducated males than educated ones, and there were more educated
employed males than uneducated unemployed ones. There were more
educated employed males under 35 years of age than employed uneducated
males over 35 years of age. Show that there are more uneducated
employed males under 35 years of age than educated unemployed
males over 35 years of age.





12. If in a series of houses actually invaded by small-pox 70%
of the inhabitants are attacked and 85% have been vaccinated, what
is the lowest percentage of the vaccinated that must have been
attacked? [I. A. S. 1955]

[Ans. 55/85 or 65 percent]


11.7. Association of Attributes

Independence. Two attributes A and B are said to be independent
if there exists no relationship of any sort between them. Then
we expect to find the same proportion of A's among the B's as among
the not-B's (i.e. $\beta$'s).

Criterion of Independence. The criterion of independence for
two attributes A and B is as follows:

$\frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)}$ ...(1)

Proposition 8. Assume that the relation:

$\frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)}$

holds good. Then the corresponding relations:

$\frac{(\alpha B)}{(B)} = \frac{(\alpha\beta)}{(\beta)}$ ...(2)

$\frac{(AB)}{(A)} = \frac{(\alpha B)}{(\alpha)}$ ...(3)

$\frac{(A\beta)}{(A)} = \frac{(\alpha\beta)}{(\alpha)}$ ...(4)

must also hold.


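Proof. The relation $\frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)}$ yields

$1 - \frac{(AB)}{(B)} = 1 - \frac{(A\beta)}{(\beta)}$.

Then $\frac{(B)-(AB)}{(B)} = \frac{(\beta)-(A\beta)}{(\beta)}$

which, in view of the fact that

$(B) = (AB)+(\alpha B)$ and $(\beta) = (A\beta)+(\alpha\beta)$,

furnishes:

$\frac{(\alpha B)}{(B)} = \frac{(\alpha\beta)}{(\beta)}$

This proves (2).

To prove (4), we write (1) in the form

$\frac{(AB)}{(A\beta)} = \frac{(B)}{(\beta)}$.

Then $1 + \frac{(AB)}{(A\beta)} = 1 + \frac{(B)}{(\beta)}$ yields

$\frac{(A\beta)+(AB)}{(A\beta)} = \frac{(\beta)+(B)}{(\beta)}$,

that is,

$\frac{(A)}{(A\beta)} = \frac{N}{(\beta)}$.

This provides:

$\frac{(A)}{(A\beta)} = \frac{(A)+(\alpha)}{(\beta)}$

so that

$\frac{(\beta)}{(A\beta)} = \frac{(A)+(\alpha)}{(A)} = 1 + \frac{(\alpha)}{(A)}$.

Recall that $(\beta) = (A\beta)+(\alpha\beta)$. Then we find

$1 + \frac{(\alpha\beta)}{(A\beta)} = 1 + \frac{(\alpha)}{(A)}$

which reduces to

$\frac{(\alpha\beta)}{(A\beta)} = \frac{(\alpha)}{(A)}$

from which we obtain

$\frac{(A\beta)}{(A)} = \frac{(\alpha\beta)}{(\alpha)}$

This proves (4). Similarly the identity (3) can be established.

Remark. Assume that the frequencies are grouped
into a table with two rows and two columns. Then it is easier to
find the relations stated in the proposition.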


Attribute      B                $\beta$                  Total

A              $(AB)$           $(A\beta)$               $(A)$
$\alpha$       $(\alpha B)$     $(\alpha\beta)$          $(\alpha)$

Total          $(B)$            $(\beta)$                N

The equation

$\frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)}$

states a certain equality for the columns. If this is true, the corresponding equation

$\frac{(AB)}{(A)} = \frac{(\alpha B)}{(\alpha)}$

must be true for the rows, and so on.

Proposition 9. If the attributes A and B are independent, the
proportion of AB's in the population is equal to the proportion of
A's multiplied by the proportion of B's; that is,

$\frac{(AB)}{N} = \frac{(A)}{N} \cdot \frac{(B)}{N}$ ...(*)

Proof. Recall that

$\frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)}$.

Then

$\frac{(AB)}{(B)} = \frac{(AB)+(A\beta)}{(B)+(\beta)} = \frac{(A)}{N}$

giving: $(AB) = \frac{(A)(B)}{N}$

which can be put into the form

$\frac{(AB)}{N} = \frac{(A)}{N} \cdot \frac{(B)}{N}$

This proves the proposition.

Remark. The advantage of the form

$(AB) = \frac{(A)(B)}{N}$ over the form $\frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)}$

is that it gives an expression for the second order frequency in terms
of the frequencies of the first order and the whole number of
observations alone; but the form

$\frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)}$ does not.

11.8. Association. Assume that attributes A and B are not
independent but are related in some way or other. Then

$(AB) > \frac{(A)(B)}{N}$

implies that they are positively associated, or simply associated;
otherwise

$(AB) < \frac{(A)(B)}{N}$

implies that A and B are negatively associated or, more briefly,
disassociated.

Complete Association and Disassociation. Assume that A and
B are two attributes. Then for complete association all A's must
be B's and all B's must be A's. This implies that the A's and the B's
occur in the population in equal numbers; more widely, we may say
that all A's are B's or all B's are A's according to whether the A's
or the B's are in the minority.

For complete disassociation no A's are B's and no $\alpha$'s are $\beta$'s,
or more widely, either of these statements is true. Now we have
the following wider definition.

Two attributes are completely associated if one of them
cannot occur without the other, though the other may happen without
the one.

Illustrative Examples 

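Ex. 1. Show, as briefly as possible, whether A and B are
independent, positively associated or negatively associated in each of
the following cases:

(a) N = 5000, (A) = 2350, (B) = 3100, (AB) = 1600
(b) (A) = 490, (AB) = 294, $(\alpha)$ = 570, $(\alpha B)$ = 380
(c) (AB) = 256, $(\alpha B)$ = 768, $(A\beta)$ = 48, $(\alpha\beta)$ = 144

(a) $\frac{(A)(B)}{N} = \frac{2350 \times 3100}{5000} = 1457 < 1600 = (AB)$

=> A and B are positively associated.

(b) (AB) = 294 with (A) = 490 gives:

$\frac{(AB)}{(A)} = \frac{294}{490} = 0.6$

and $(\alpha B)$ = 380 when combined with $(\alpha)$ = 570 provides:

$\frac{(\alpha B)}{(\alpha)} = \frac{380}{570} = 0.67$

In conclusion: $\frac{(AB)}{(A)} < \frac{(\alpha B)}{(\alpha)}$, and hence A and B are
negatively associated.

(c) $\frac{(AB)}{(\alpha B)} = \frac{256}{768} = \frac{1}{3}$

and $\frac{(A\beta)}{(\alpha\beta)} = \frac{48}{144} = \frac{1}{3}$.

Hence $\frac{(AB)}{(\alpha B)} = \frac{(A\beta)}{(\alpha\beta)}$.

This implies that A and B are independent attributes.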
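
The comparison of (AB) with (A)(B)/N used in case (a) can be sketched in a few lines of Python. The function name `association` and its return strings are illustrative assumptions; exact fractions are used so that the independent case is detected without rounding trouble.

```python
from fractions import Fraction


def association(n, a, b, ab):
    """Compare (AB) with its independence value (A)(B)/N."""
    expected = Fraction(a * b, n)
    if ab > expected:
        return "positive"
    if ab < expected:
        return "negative"
    return "independent"


# Case (a): N = 5000, (A) = 2350, (B) = 3100, (AB) = 1600.
print(association(5000, 2350, 3100, 1600))

# Case (c): first recover N, (A), (B) from the four second-order frequencies.
ab, alpha_b, a_beta, alpha_beta = 256, 768, 48, 144
n = ab + alpha_b + a_beta + alpha_beta
print(association(n, ab + a_beta, ab + alpha_b, ab))
```

Cases stated through ratios such as (AB)/(A) and $(\alpha B)/(\alpha)$ lead to the same verdict once N, (A) and (B) have been rebuilt from the given frequencies.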

Ex. 2. Do you find any association between the temper of
brothers and sisters from the following data:

Good-natured brothers and good-natured sisters : 1230
Good-natured brothers and sullen-natured sisters : 850
Sullen-natured brothers and good-natured sisters : 530
Sullen-natured brothers and sullen-natured sisters : 980

(Agra 1957)

Assume that A denotes good-natured brothers and B good-natured
sisters. Then the nature of the problem provides:

(AB) = 1230, $(A\beta)$ = 850, $(\alpha B)$ = 530, $(\alpha\beta)$ = 980
so that (A) = (AB) + $(A\beta)$ = 1230 + 850 = 2080,
$(\alpha)$ = $(\alpha B)$ + $(\alpha\beta)$ = 530 + 980 = 1510
and (B) = (AB) + $(\alpha B)$ = 1230 + 530 = 1760.

The percentage of good-natured sisters among the sisters of the
good-natured brothers is

$\frac{(AB)}{(A)} \times 100 = \frac{1230}{2080} \times 100 = 59$

and the percentage of good-natured sisters among the sisters of the
sullen brothers is

$\frac{(\alpha B)}{(\alpha)} \times 100 = \frac{530}{1510} \times 100 = 35$.

Conclusion. Comparison between the two percentages reveals
that there is a positive association between the tempers of brothers
and sisters.

Ex. 3. The male population of Uttar Pradesh is 250 lakhs.
The number of literate males is 20 lakhs, the total number
of male criminals is 26 thousand, and the number of literate male
criminals is 2 thousand. Is there any association between literacy
and criminality? [Agra 1951]

Suppose A denotes literates and B denotes criminals. Then in
lakhs we have

(A) = 20, (B) = 26/100, (AB) = 2/100 and N = 250.

Evidently

$(AB) = 0.02 < \frac{(A)(B)}{N} = \frac{20 \times 0.26}{250} = 0.0208$.

Then we claim that literacy and criminality are negatively associated;
that is, literacy checks criminality.

Examples 

1. Show whether A and B are independent, positively associated
or negatively associated in each of the following cases:

(a) N = 1000, (A) = 470, (B) = 620, (AB) = 320
(b) (A) = 980, (AB) = 588, $(\alpha)$ = 1140, $(\alpha B)$ = 760
(c) (AB) = 256, $(\alpha B)$ = 768, $(A\beta)$ = 48, $(\alpha\beta)$ = 144.

[U.P.C.S. 1961]

[Ans. (a) $\frac{(A)(B)}{N} = 291.4 < 320$, and there is a positive
association.

(b) $\frac{(AB)}{(A)} = \frac{588}{980} = \frac{3}{5}$, $\frac{(\alpha B)}{(\alpha)} = \frac{760}{1140} = \frac{2}{3}$; since $\frac{(AB)}{(A)} < \frac{(\alpha B)}{(\alpha)}$,
there is a negative association.

(c) A and B are independent.]

2. From the following, decide whether blindness and baldness
are associated:

Total population = 16,264,000; number of bald-headed = 24,441;
number of blind = 7,623; number of bald-headed blind = 221.

[Agra 1961]

[Ans. $\frac{(A)(B)}{N} \approx 11 < (AB) = 221$, and hence blindness and
baldness are positively associated.]

3. In a state with a total population of 70,000 adults, 34,000 are
males, and out of a total of 6,000 graduates 700 are females. Out of
1200 graduate employees of the state, 200 are females. Is there
any sex bias in education among people? The state holds that
no distinction is made in appointments in respect of sex.

[U.P.C.S. 1953]

[Ans. Yes, there is sex bias in education because the percentage
of male graduates is $\frac{5300 \times 100}{34000} \approx 16$ and the percentage of
female graduates is $\frac{700 \times 100}{36000} \approx 2$.

The number of females expected among 1200 graduate employees is
$\frac{700 \times 1200}{6000} = 140$, which is less than the actual number of female
employees (200). Hence the claim that "the state holds that no distinction
is made in appointments in respect of sex" is justified.]

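11.9. Symbols $(AB)_0$ and $\delta$

The symbols we shall use are as follows:

$(AB)_0 = \frac{(A)(B)}{N}$ ; $(\alpha\beta)_0 = \frac{(\alpha)(\beta)}{N}$

$\delta = (AB) - (AB)_0 = (AB) - \frac{(A)(B)}{N}$

Also $(\alpha\beta) - (\alpha\beta)_0 = (\alpha\beta) - \frac{(\alpha)(\beta)}{N}$

$= N - (A) - (B) + (AB) - \frac{\{N-(A)\}\{N-(B)\}}{N}$

$= N - (A) - (B) + (AB) - N + (A) + (B) - \frac{(A)(B)}{N}$

$= (AB) - \frac{(A)(B)}{N} = \delta$.

In general,

$(AB) - (AB)_0 = (\alpha\beta) - (\alpha\beta)_0 = (A\beta)_0 - (A\beta) = (\alpha B)_0 - (\alpha B) = \delta$.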

Proposition 10. $\delta = \frac{1}{N}\left[(AB)(\alpha\beta) - (\alpha B)(A\beta)\right]$ [Agra 1970]

Proof. Recall that $\delta = (AB) - \frac{(A)(B)}{N}$.

Then $\delta = \frac{1}{N}\left[(AB)N - (A)(B)\right]$.

Expressing all the frequencies of the numerator in terms of
those of the second order, we obtain

$\delta = \frac{1}{N}\left[(AB)\{(AB)+(\alpha B)+(A\beta)+(\alpha\beta)\} - \{(AB)+(A\beta)\}\{(AB)+(\alpha B)\}\right]$

Slight simplification yields:

$\delta = \frac{1}{N}\left[(AB)(\alpha\beta) - (\alpha B)(A\beta)\right]$,

proving the proposition.

Remark. The common difference $(AB) - (AB)_0$ equals $1/N$ of
the difference of the "cross products" $(AB)(\alpha\beta)$ and $(\alpha B)(A\beta)$.

The form given in the proposition is useful to note.

The difference of the cross-products may be large if N be
large, although $\delta$ is in fact very small.

$(AB) = (AB)_0$ says that A and B are independent attributes,
whereas $(AB) > (AB)_0$ implies that A and B are positively associated
and $(AB) < (AB)_0$ implies that A and B are negatively
associated.

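Proposition 11. Define $\delta$ as follows: $\delta = (AB) - (AB)_0$.
Then

(a) $(AB)^2 + (\alpha\beta)^2 - (\alpha B)^2 - (A\beta)^2 = [(A)-(\alpha)][(B)-(\beta)] + 2N\delta$

[Agra 1970]

(b) $\delta = \frac{(A)(\alpha)}{N}\left[\frac{(AB)}{(A)} - \frac{(\alpha B)}{(\alpha)}\right] = \frac{(B)(\beta)}{N}\left[\frac{(AB)}{(B)} - \frac{(A\beta)}{(\beta)}\right]$

[Agra 1970]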





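Proof. (b) The definition of $\delta$ gives

$\delta = (AB) - \frac{(A)(B)}{N}$

which may be written as

$\delta = \frac{(A)(\alpha)}{N}\left[\frac{(AB)N}{(A)(\alpha)} - \frac{(B)}{(\alpha)}\right]$

which, by dint of $N = (A)+(\alpha)$
and $(B) = (AB)+(\alpha B)$, assumes the form

$\delta = \frac{(A)(\alpha)}{N}\left[\frac{(AB)\{(A)+(\alpha)\} - (A)\{(AB)+(\alpha B)\}}{(A)(\alpha)}\right]$

$= \frac{(A)(\alpha)}{N}\left[\frac{(AB)(\alpha) - (\alpha B)(A)}{(A)(\alpha)}\right]$

which collapses into

$\delta = \frac{(A)(\alpha)}{N}\left[\frac{(AB)}{(A)} - \frac{(\alpha B)}{(\alpha)}\right]$,

proving the first part of (b).

Again $\delta = (AB) - \frac{(A)(B)}{N}$

can be put into the form

$\delta = \frac{(B)(\beta)}{N}\left[\frac{(AB)N}{(B)(\beta)} - \frac{(A)}{(\beta)}\right]$

$= \frac{(B)(\beta)}{N}\left[\frac{(AB)N - (A)(B)}{(B)(\beta)}\right]$

$= \frac{(B)(\beta)}{N}\left[\frac{(AB)\{(B)+(\beta)\} - \{(AB)+(A\beta)\}(B)}{(B)(\beta)}\right]$

$= \frac{(B)(\beta)}{N}\left[\frac{(AB)}{(B)} - \frac{(A\beta)}{(\beta)}\right]$,

after slight simplification. This proves the second part of (b).

To prove (a), the right-hand side of (a) yields:

$[(A)-(\alpha)][(B)-(\beta)] + 2N\delta$

$= [(AB)+(A\beta)-(\alpha B)-(\alpha\beta)][(AB)+(\alpha B)-(A\beta)-(\alpha\beta)] + 2[(AB)N - (A)(B)]$

$= [(AB)-(\alpha B)+(A\beta)-(\alpha\beta)][(AB)+(\alpha B)-\{(A\beta)+(\alpha\beta)\}]$
$\quad + 2[(AB)\{(AB)+(A\beta)+(\alpha B)+(\alpha\beta)\} - \{(AB)+(A\beta)\}\{(AB)+(\alpha B)\}]$

$= (AB)^2 - (A\beta)^2 - (\alpha B)^2 + (\alpha\beta)^2 - 2(AB)(\alpha\beta) + 2(A\beta)(\alpha B) + 2(AB)(\alpha\beta) - 2(A\beta)(\alpha B)$

$= (AB)^2 + (\alpha\beta)^2 - (\alpha B)^2 - (A\beta)^2$.

This proves (a).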
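
Propositions 10 and 11 are easy to confirm on a numerical table. The sketch below, in Python with exact fractions, computes $\delta$ from its definition and from each of the alternative forms. The function name `delta_forms` and the choice of the brother and sister temper figures (1230, 850, 530, 980) as test data are illustrative assumptions.

```python
from fractions import Fraction


def delta_forms(ab, a_beta, alpha_b, alpha_beta):
    """Return delta computed four ways, plus both sides of Proposition 11(a)."""
    n = ab + a_beta + alpha_b + alpha_beta
    a, alpha = ab + a_beta, alpha_b + alpha_beta      # row totals (A), (alpha)
    b, beta = ab + alpha_b, a_beta + alpha_beta       # column totals (B), (beta)

    definition = ab - Fraction(a * b, n)                              # (AB) - (A)(B)/N
    cross = Fraction(ab * alpha_beta - alpha_b * a_beta, n)           # Proposition 10
    rows = Fraction(a * alpha, n) * (Fraction(ab, a) - Fraction(alpha_b, alpha))
    cols = Fraction(b * beta, n) * (Fraction(ab, b) - Fraction(a_beta, beta))

    lhs = ab**2 + alpha_beta**2 - alpha_b**2 - a_beta**2              # Proposition 11(a)
    rhs = (a - alpha) * (b - beta) + 2 * n * definition
    return definition, cross, rows, cols, lhs, rhs


d, cross, rows, cols, lhs, rhs = delta_forms(1230, 850, 530, 980)
print(float(d))                 # common value of delta
print(d == cross == rows == cols, lhs == rhs)
```

Because the arithmetic is exact, the equality checks confirm the identities for this table without any rounding tolerance.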

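Proposition 12. Assume that

$(AB)_1, (\alpha B)_1, (A\beta)_1, (\alpha\beta)_1$
and $(AB)_2, (\alpha B)_2, (A\beta)_2, (\alpha\beta)_2$

are two sets of second-order frequencies corresponding to the same values of (A), (B), $(\alpha)$,
$(\beta)$. Then

$(AB)_1 - (AB)_2 = (\alpha B)_2 - (\alpha B)_1 = (A\beta)_2 - (A\beta)_1 = (\alpha\beta)_1 - (\alpha\beta)_2$.

Proof. Evidently $(A) = (AB)_1 + (A\beta)_1$
and $(A) = (AB)_2 + (A\beta)_2$. Then

$(AB)_1 + (A\beta)_1 = (AB)_2 + (A\beta)_2$,

yielding:

$(AB)_1 - (AB)_2 = (A\beta)_2 - (A\beta)_1$ ...(1)

Similarly, $(\alpha) = (\alpha B)_1 + (\alpha\beta)_1 = (\alpha B)_2 + (\alpha\beta)_2$
provides:

$(\alpha B)_2 - (\alpha B)_1 = (\alpha\beta)_1 - (\alpha\beta)_2$ ...(2)

Now $(\beta) = (A\beta)_1 + (\alpha\beta)_1 = (A\beta)_2 + (\alpha\beta)_2$ furnishes

$(A\beta)_2 - (A\beta)_1 = (\alpha\beta)_1 - (\alpha\beta)_2$ ...(3)

The proposition follows in view of (1), (3) and (2).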

11.10. Coefficient of Association (due to Yule)

The coefficient of association Q is defined by the expression

$Q = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(AB)(\alpha\beta) + (A\beta)(\alpha B)}$

which, in view of $N\delta = (AB)(\alpha\beta) - (A\beta)(\alpha B)$, assumes the
form

$Q = \frac{N\delta}{(AB)(\alpha\beta) + (A\beta)(\alpha B)}$

Remarks.

R1: Independence of attributes implies that $\delta = 0$, and
then Q is zero.

R2: Complete association of attributes requires that Q takes
the value +1; that is, Q = 1 iff $(A\beta)(\alpha B) = 0$.

R3: Complete disassociation of attributes requires that Q
assumes the value -1; that is, Q = -1 iff $(AB)(\alpha\beta) = 0$.


Coefficient of Colligation

The coefficient of colligation Y is defined by

$Y = \frac{1 - \sqrt{\dfrac{(A\beta)(\alpha B)}{(AB)(\alpha\beta)}}}{1 + \sqrt{\dfrac{(A\beta)(\alpha B)}{(AB)(\alpha\beta)}}}$

This coefficient also possesses the same properties as Q.





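Proposition 13. An interlinked relation between Q and Y is

$Q = \frac{2Y}{1+Y^2}$.

Proof. Letting $k = \frac{(A\beta)(\alpha B)}{(AB)(\alpha\beta)}$, we obtain $Q = \frac{1-k}{1+k}$ and

$Y = \frac{1-\sqrt{k}}{1+\sqrt{k}}$

which, on squaring, provides:

$Y^2 = \frac{1+k-2\sqrt{k}}{1+k+2\sqrt{k}}$

Then

$1 + Y^2 = 1 + \frac{1+k-2\sqrt{k}}{1+k+2\sqrt{k}} = \frac{2(1+k)}{1+k+2\sqrt{k}}$

so that

$\frac{2}{1+Y^2} = \frac{1+k+2\sqrt{k}}{1+k}$.

Then

$\frac{2Y}{1+Y^2} = \frac{1+k+2\sqrt{k}}{1+k} \cdot \frac{1-\sqrt{k}}{1+\sqrt{k}} = \frac{(1+\sqrt{k})(1-\sqrt{k})}{1+k} = \frac{1-k}{1+k} = Q$.

This proves the proposition.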
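
A numerical check of the relation between Q and Y may be sketched as follows in Python. The function names `yule_q` and `yule_y` are illustrative assumptions, and the 2 x 2 frequencies (12, 26, 16, 6) are chosen simply as sample figures; the sketch assumes that $(AB)(\alpha\beta)$ is not zero, since Y is written here as a ratio with that product in the denominator.

```python
import math


def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Yule's coefficient of association Q."""
    num = ab * alpha_beta - a_beta * alpha_b
    den = ab * alpha_beta + a_beta * alpha_b
    return num / den


def yule_y(ab, a_beta, alpha_b, alpha_beta):
    """Coefficient of colligation Y (requires (AB)(alpha beta) != 0)."""
    root = math.sqrt((a_beta * alpha_b) / (ab * alpha_beta))
    return (1 - root) / (1 + root)


q = yule_q(12, 26, 16, 6)
y = yule_y(12, 26, 16, 6)
print(q, y)
print(abs(2 * y / (1 + y * y) - q) < 1e-12)   # Proposition 13 holds numerically
```

Y is numerically smaller than Q for the same table, which is worth remembering when the two coefficients are compared across sources.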

Illustrative Examples 

Ex. 1. The male population of a certain state in India is 331
lakhs. The number of literate males is 66 lakhs, the number of male
criminals is 33 thousand, and the number of literate male criminals
is 6 thousand. Compute the coefficient of association between
literacy and criminality in this state.

Denote literate males by A and criminals by B. Then the
problem furnishes the following information:

N = 331 lakhs, (A) = 66 lakhs, (B) = 33 thousand, (AB) = 6
thousand.

To compute the coefficient of association between A and B,
we have to determine $(A\beta)$, $(\alpha B)$ and $(\alpha\beta)$.

Now $(A\beta) = (A) - (AB)$
= 6600000 - 6000
= 6594000;

$(\alpha B) = (B) - (AB)$
= 33000 - 6000 = 27000;

$(\alpha\beta) = (\alpha) - (\alpha B)$
$= N - (A) - (\alpha B)$
= 33100000 - 6600000 - 27000
= 26473000.

$Q = \frac{(AB)(\alpha\beta) - (\alpha B)(A\beta)}{(AB)(\alpha\beta) + (\alpha B)(A\beta)}$ gives

$Q = \frac{6000 \times 26473000 - 6594000 \times 27000}{6000 \times 26473000 + 6594000 \times 27000}$

$= -0.057$ approximately.

Thus we conclude that there exists a negative association between
literacy and criminality, though with Q so close to zero its degree is low.

Ex. 2. In an experiment on immunisation of cattle from tuberculosis,
the following results were obtained:

                    Died (or affected)    Unaffected
Inoculated                  12                26
Not inoculated              16                 6

Examine the effect of vaccine in controlling susceptibility to
tuberculosis. [I. A. S. 1948, Agra B. Sc. 1971]

Denote inoculation by A and died (or affected) by B. Then

(AB) = 12, $(A\beta)$ = 26, $(\alpha B)$ = 16, $(\alpha\beta)$ = 6.

Then

$Q = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(AB)(\alpha\beta) + (A\beta)(\alpha B)} = \frac{12 \times 6 - 26 \times 16}{12 \times 6 + 26 \times 16} = \frac{-344}{488} = -0.70$

Conclusion. There exists negative association between vaccine
and susceptibility to tuberculosis; in other words, the vaccine
prevents the attack of tuberculosis to a great extent.


Exercises 

1. The following table is reproduced from a memoir written
by Karl Pearson:

                                      Eye colour in Son
                                   Not light      Light
Eye colour in Father   Not light      230          148
                       Light          151          471

Decide whether the colour of the son's eye is associated with
that of the father.

[Ans. Denote the light eye colour of father by A and that of son
by B. Then

$Q = \frac{471 \times 230 - 148 \times 151}{471 \times 230 + 148 \times 151} = +0.658$,

showing that there exists positive association between the eye
colour of father and son to a fairly high degree.]

2. Investigate the association between darkness of eye-colour
in father and son from the following data:

Fathers with dark eyes and sons with dark eyes              50
Fathers with dark eyes and sons with not-dark eyes          79
Fathers with not-dark eyes and sons with dark eyes          89
Fathers with not-dark eyes and sons with not-dark eyes     782

Also tabulate for comparison the frequencies that would have
been observed if there had been no heredity.

[I. A. S. 1954; Delhi 1967]

[Ans. $Q = \frac{50 \times 782 - 79 \times 89}{50 \times 782 + 79 \times 89} = 0.695$; positive association of
a high degree between the eye colour of father and son.

Assume that there had been no heredity. Then
(A) = 129, (B) = 139, $(\alpha)$ = 871, N = 1000, and
$(AB)_0 = 18$, $(A\beta)_0 = 111$, $(\alpha B)_0 = 121$, $(\alpha\beta)_0 = 750$.]
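
The "no heredity" table asked for in Exercise 2 is the table of independence values $(AB)_0 = (A)(B)/N$, and so on. A minimal Python sketch is given below; the function name `independence_table` and the dictionary keys are illustrative assumptions, with a and b standing for $\alpha$ and $\beta$.

```python
def independence_table(ab, a_beta, alpha_b, alpha_beta):
    """Frequencies expected under independence, keeping the observed margins."""
    n = ab + a_beta + alpha_b + alpha_beta
    a, alpha = ab + a_beta, alpha_b + alpha_beta
    b, beta = ab + alpha_b, a_beta + alpha_beta
    return {
        "AB0": a * b / n,
        "Ab0": a * beta / n,
        "aB0": alpha * b / n,
        "ab0": alpha * beta / n,
    }


# Eye-colour data of Exercise 2: 50, 79, 89, 782.
expected = independence_table(50, 79, 89, 782)
for name, value in expected.items():
    print(name, round(value))
```

The four expected frequencies keep the same row and column totals as the observed table, so only one of them is free once the margins are fixed.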

3. Compute the coefficient of association between the types of
college training and success in teaching from the table:

Institution            Successful    Unsuccessful    Total

Teacher's College          58             42          100
University                 49             51          100

Total                     107             93          200

(Rajasthan 1956)

[Ans. Denote Teacher's College by A and successful by B.
Then (AB) = 58, $(A\beta)$ = 42, $(\alpha B)$ = 49, $(\alpha\beta)$ = 51;

$Q = \frac{58 \times 51 - 49 \times 42}{58 \times 51 + 49 \times 42} = 0.18$]




4. Can vaccination be regarded as a preventive measure for
small-pox from the data given below?

'Of 1482 persons in a locality exposed to small-pox, 368 in all
were attacked.'

'Of 1482 persons, 343 had been vaccinated and of these only 35
were attacked.' (Agra 1968)

[Ans. Denote vaccination by A and attack by B. Then N =
1482, (A) = 343, (B) = 368, (AB) = 35, $(\alpha\beta)$ = 806, $(A\beta)$ = 308, $(\alpha B)$ =
333, Q = -0.57, a negative association.

Interpretation. Vaccination is a preventive measure against
small-pox.]

5. In experiments on the immunisation of cattle from tuberculosis
the following results were secured:

                                            Cattle
Treatment                       Died of tuberculosis    Unaffected or      Total
                                or very seriously       only slightly
                                affected                affected

Inoculated with vaccine                 6                    13              19
Not inoculated or inoculated
with control media                      8                     3              11

Total                                  14                    16              30

Find the coefficient of association between inoculation and exemption from
serious tuberculosis.

[Ans. Q = +0.70]

6. In an experiment the following results were obtained:

                          Poor children    Well-to-do children
                             Percent            Percent
Below normal weight            55                 13
Above normal weight            11                 48

Calculate the coefficient of association between the weight of the
children and their social status.

[Ans. Q = +0.90]






7. Comment on the following statements:

(a) 99% of the people who drink beer die before reaching 100
years of age. Therefore drinking beer is bad for longevity.

(Agra 1970)

(b) More tax-payers die in a year than the general death rate
would indicate. Thus it is unhealthy to be rich.

(c) It is found that 80% of those who smoke develop indigestion
in middle age. Then smoking is bad for digestion.

(a) The statement does not provide the percentage of the people
who do not drink beer and yet die before reaching the age of
100. That is to say, we have only $(AB)/(A)$ to hand but not
$(\alpha B)/(\alpha)$. Suppose the percentage of deaths before reaching the age of
100 for those who do not drink beer is also 99. This implies
that drinking beer has no effect upon the span of life. Only if
this percentage is less does it support the statement that
drinking is bad for longevity. In that case $(AB)/(A) > (\alpha B)/(\alpha)$, so
that A (which stands for drinking) and B (for death before 100) are positively
associated.


12 

THEORY OF SAMPLING 

12.1. Sampling. Whenever a large population has to be examined
with respect to a specified characteristic, we choose a sample
of individuals from that population and, from the properties of
the sample relating to the given characteristic, we endeavour to
estimate those of the population. The theory of sampling has two
objectives: (1) estimation of the properties of the population from
those of the sample, (2) ascertaining the deviations from the
true values that may be expected in the estimate obtained.

Fundamental to the theory of sampling is the concept of
random sampling. Random sampling is defined by the property that
in the selection of an individual from the population, each member
of the population has the same chance of being selected.

12.2. Sampling of attributes. In the sampling of attributes, we
are concerned with whether the individual selected in sampling possesses
or does not possess the specified attribute or characteristic. For instance,
in sampling from a population of men, we may be concerned
whether they are smokers or non-smokers. In sampling from
births our consideration may be whether the baby is male or
female. We shall call the selection of an individual possessing the
attribute a 'success'. Random sampling in which the probability of success p is
constant at each trial is called simple sampling. The value of p
is the relative frequency of the occurrence of the attribute in the
population from which the sample is drawn. In simple sampling,
p remains constant and its value at any trial is independent of the
success or failure of preceding trials. Thus for sampling to be simple,
either the population must be very large or the individual selected
must be returned to the population before the next trial, success or
failure having been noted.

Mean and standard deviation in a simple sample of n members.
The drawing of a simple sample of n members is identical with a
series of n independent trials in which the probability of success
p is constant. The probabilities of 0, 1, 2, ... successes in a
simple sample of n members are the terms of the binomial expansion
of $(q+p)^n$, where $q = 1-p$. This binomial probability distribution
is called the sampling distribution of the number of successes





in the sample. Thus if x is the noted number of successes in the
sample of n members, we have

$E(x) = np$, $\mathrm{Var}(x) = npq$ or S.D. of $x = \sqrt{npq}$.

This S.D. is usually called the standard error (S.E.) of the number
of successes in a sample of size n. The deviation of the observed
number of successes from the expected value np is looked
upon as 'error'.

$x/n$ is the proportion of successes in the sample of size n and so

S.E. of the number of successes $= \sqrt{npq}$

S.E. of the proportion of successes $= \sqrt{pq/n}$

where n is the sample size and p is the constant relative frequency
of the occurrence of the attribute in the population from
which the sample is drawn.

The precision of the proportion of successes $x/n$ is inversely
proportional to its S.E., that is, proportional to $\sqrt{n}$. For doubling
the precision, it is necessary to take the sample size as 4n.
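
The two standard errors, and the remark about quadrupling the sample size, can be illustrated with a short Python sketch. The function names `se_count` and `se_proportion` and the figures n = 400, p = 0.5 are illustrative assumptions.

```python
import math


def se_count(n, p):
    """S.E. of the number of successes: sqrt(npq)."""
    return math.sqrt(n * p * (1 - p))


def se_proportion(n, p):
    """S.E. of the proportion of successes: sqrt(pq/n)."""
    return math.sqrt(p * (1 - p) / n)


n, p = 400, 0.5
print(se_count(n, p))            # S.E. of the number of successes
print(se_proportion(n, p))       # S.E. of the proportion
print(se_proportion(4 * n, p))   # half the previous value: precision doubled
```

Note that the S.E. of the count grows with n while the S.E. of the proportion shrinks, which is why precision is discussed in terms of the proportion.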

12.3. Test of Significance. Large Samples.

We have seen above that the sampling distributions of the number
of successes x and the proportion of successes $x/n$ in a simple sample
of size n are binomial distributions with parameters n and p.
We know that for large values of n the binomial distribution approximates
to the normal distribution, and the probabilities for corresponding
intervals in the two distributions tend to equality as n increases
indefinitely. Thus the variable

$Z = \frac{x - np}{\sqrt{npq}}$

is the standard normal variate for large n and

$P(|Z| > 2) = 0.0456$
$P(|Z| > 3) = 0.0027$

We may therefore conclude that, for large sample size n, the
probability that the number of successes x will deviate from its
expected value np by more than three times the S.E. is very small
(it is only 0.0027), and that a deviation of more than twice the S.E.
is rather unusual.




Bearing this in mind, we now devise a test to judge the
hypothesis that a given large sample of n members was obtained
by simple sampling from a population in which the relative frequency
of the occurrence of the attribute under consideration is p.

Suppose it is found that $|x - np| > 3\sqrt{npq}$; then an event
has happened which, on the hypothesis of simple sampling, is very
improbable. In this case the difference or 'error' $|x - np|$ is highly
significant and the truth of the supposed hypothesis is very improbable.
Here we will suspect either that the value of p employed is
incorrect, or else that the conditions of simple sampling were not
observed. The deviation $|x - np| < 2\sqrt{npq}$ is regarded as not significant.
For deviations $|x - np| > 2\sqrt{npq}$, the significance increases
with $|x - np|$.

Significance is usually regarded as beginning where
$P\{|x - np| > 2\sqrt{npq}\}$ is less than 5%. It may be noted that the above
test may furnish evidence against the hypothesis, but it cannot prove
the hypothesis correct. At most, we can say that the test provides
no evidence against the hypothesis.

If in place of the number of successes x we take the proportion
of successes $x/n$ in a sample of size n, then

$Z = \frac{x/n - p}{\sqrt{pq/n}}$

is the standard normal variate for large n and we use normal
probability integrals for it. In some cases the value of p in the
population is not known, and it may be replaced by its estimate
from the sample without serious error, since the S.E.
$\sqrt{pq/n}$ is small when n is large.

The limits for the proportion of successes in the population
are

$\frac{x}{n} \pm 3\sqrt{\frac{pq}{n}}$, with p and q estimated from the sample.

Ex. 1. A coin is tossed 400 times and it turns up head 216
times. Discuss whether the coin may be an unbiased one.

We make the hypothesis that the coin is unbiased. On this
assumption the probability of a head turning up is $\tfrac{1}{2}$.

Hence the expected number of heads in 400 tosses $= 400 \times \tfrac{1}{2} = 200$,
i.e. np = 200.

The observed number of heads is 216, i.e. here x = 216.

S.E. of the number of heads $= \sqrt{400 \times \tfrac{1}{2} \times \tfrac{1}{2}} = 10$.

Thus $Z = \frac{x - np}{\sqrt{npq}} = \frac{16}{10} = 1.6$, which is less than 2.

The value of Z is not significant and the difference of 16 may be
explained as due to fluctuations of simple sampling. Thus the data
provide no evidence against the hypothesis.

Ex. 2. A certain cubical die was thrown 9,000 times, and a 5
or 6 was obtained 3,240 times. On the assumption of random throwing,
do the data indicate an unbiased die?

Suppose we make the hypothesis that the die is unbiased, and
so the probability p of a 5 or a 6 is $\tfrac{2}{6}$, i.e. $\tfrac{1}{3}$, and $q = \tfrac{2}{3}$.

np = expected number of successes $= 9000 \times \tfrac{1}{3} = 3{,}000$.
x = observed number of successes = 3,240.
Thus $|x - np| = 240$.

S.E. of the number of successes

$= \sqrt{npq} = \sqrt{9000 \times \tfrac{1}{3} \times \tfrac{2}{3}} = 10\sqrt{20} = 44.72$

Hence $Z = \frac{|x - np|}{\sqrt{npq}} = \frac{240}{44.72} = 5.4$, which is greater than 3.

This value of Z is highly significant and the deviation 240 is
most unlikely to appear as a result of simple sampling with $p = \tfrac{1}{3}$.
Therefore we conclude that the die is almost certainly biased, and
that p is not equal to $\tfrac{1}{3}$.

We can however estimate the limits within which the true
value of p lies.

Estimate of p from the sample $= \frac{3240}{9000} = 0.36$. The S.E. of
the proportion of successes is then

$e' = \sqrt{\frac{0.36 \times 0.64}{9000}} = 0.005$

and $3e' = 0.015$. Hence the true value of p almost certainly lies
between the limits $0.36 \pm 3e'$, i.e. 0.345 and 0.375.
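
The large-sample test applied in Ex. 1 and Ex. 2 can be summarised in a short Python sketch. The function names `z_score` and `judge`, and the middle label "significant" for deviations between two and three standard errors, are illustrative assumptions; the die example uses x = 3240, the figure used in the worked solution.

```python
import math


def z_score(x, n, p):
    """Standardised deviation (x - np) / sqrt(npq) under the hypothesis p."""
    return (x - n * p) / math.sqrt(n * p * (1 - p))


def judge(z):
    """Conventional reading of |Z| with the thresholds 2 and 3."""
    if abs(z) < 2:
        return "not significant"
    if abs(z) > 3:
        return "highly significant"
    return "significant"


coin = z_score(216, 400, 1 / 2)      # Ex. 1: 216 heads in 400 tosses
die = z_score(3240, 9000, 1 / 3)     # Ex. 2: 3240 successes in 9000 throws
print(round(coin, 2), judge(coin))
print(round(die, 2), judge(die))
```

The test can only furnish evidence against the hypothesised p; a verdict of "not significant" does not prove the hypothesis correct.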

Ex. 3. A sample of 900 days is taken from meteorological
records of a certain district and 100 of them are found to be foggy.
What are the probable limits to the percentage of foggy days in the
district? [B. Sc., Kurukshetra 1971]

p = estimate of the proportion of foggy days $= \frac{100}{900} = \tfrac{1}{9}$ and
$q = \tfrac{8}{9}$.

Then e = standard error of the proportion of foggy days

$= \sqrt{\frac{pq}{n}} = \sqrt{\frac{\tfrac{1}{9} \times \tfrac{8}{9}}{900}} = 0.0105$

and the probable limits of the percentage of foggy days in the district

$= 100\{p \pm 3e\}$
$= 100\{\tfrac{1}{9} \pm 0.0315\}$
$= 11.11 \pm 3.15$

i.e. approximately 8 percent and 14.25 percent.

Ex. 4. In a locality of 18,000 families a sample of 840 families
was selected. Of these 840 families, 206 families were found to
have a monthly income of Rs. 50 or less. It is desired to estimate
how many out of the 18,000 families have a monthly income of
Rs. 50 or less. Within what limits would you place your estimate?

(U. P. C. S. 1948)

Here p = proportion of families having a monthly income of
Rs. 50 or less

$= \frac{206}{840} = 0.245$

so that $q = 1 - p = 0.755$.

Now e = S.E. of the proportion of successes

$= \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.245 \times 0.755}{840}} = 0.015$

The required range in percentage is given by

$100\left[p \pm 3\sqrt{\frac{pq}{n}}\right] = 100\,[0.245 \pm 0.045]$

i.e. 20 percent and 29 percent,

i.e. between 3,600 and 5,220 families are most likely to have a
monthly income of Rs. 50 or less.


Ex. 5. A random sample of 500 pineapples was taken from a
large consignment and 65 were found to be bad. Show that the S.E.
of the proportion of bad ones in a sample of this size is 0.015 and
deduce that the percentage of bad pineapples in the consignment
almost certainly lies between 8.5 and 17.5. (I. A. S. '54)

Here p = proportion of bad pineapples $= \frac{65}{500} = 0.13$
so that q = 0.87.

Now e = S.E. of the proportion of bad pineapples

$= \sqrt{\frac{0.13 \times 0.87}{500}}$

= 0.015 approx.

Hence the percentage of bad pineapples almost certainly lies between
$100(p - 3e)$ and $100(p + 3e)$,

i.e. between 8.5 and 17.5.
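
The limits $p \pm 3e$ used in Ex. 3 to Ex. 5 may be sketched in Python as follows. The function name `three_se_limits` is an illustrative assumption, and p is estimated from the sample as x/n, as in the worked examples.

```python
import math


def three_se_limits(x, n):
    """Return (p - 3e, p + 3e), where p = x/n and e = sqrt(pq/n)."""
    p = x / n
    e = math.sqrt(p * (1 - p) / n)
    return p - 3 * e, p + 3 * e


# Ex. 5: 65 bad pineapples in a sample of 500.
low, high = three_se_limits(65, 500)
print(round(100 * low, 1), round(100 * high, 1))   # 8.5 17.5

# Ex. 4: 206 of 840 families, scaled up to 18,000 families.
low4, high4 = three_se_limits(206, 840)
print(round(18000 * low4), round(18000 * high4))
```

The unrounded limits for Ex. 4 differ slightly from the 3,600 and 5,220 quoted in the solution, because the solution rounds the percentages to 20 and 29 before scaling.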

12.4. Comparison of Large Samples.

Suppose we have to test two populations $P_1$ and $P_2$ for the
prevalence of a certain attribute. We take from them large simple
samples of $n_1$ and $n_2$ individuals and observe that $p_1$ and $p_2$ are
the proportions having the attribute. Is the difference $p_1 - p_2$
significant of a real difference in the populations $P_1$, $P_2$?

We assume that the proportions in the two populations are
the same. On this hypothesis the best estimate of the common proportion
p in the populations is

$p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$

The s.e. (standard error) of the proportion in the first
sample $= \sqrt{\frac{pq}{n_1}}$.

The s.e. (standard error) of the proportion in the second
sample $= \sqrt{\frac{pq}{n_2}}$.

The s.e. e of the difference $p_1 - p_2$ is given by $e^2 = pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$,

since $V(p_1 - p_2) = V(p_1) + V(p_2) = pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$.

As the populations are assumed to be similar, $E(p_1 - p_2) = 0$.

Since $n_1$ and $n_2$ are large,

$p_1 - p_2$ is a $N(0, e^2)$ variate.

Hence $P(|p_1 - p_2| > 3e)$ is very small,

and $P(|p_1 - p_2| > 2e)$ is approximately 5 percent.

Conventionally, if $|p_1 - p_2| < 2e$ the difference may
be due to random sampling variations, and is regarded as not significant.

Next suppose we have two populations $P_1$ and $P_2$ with proportions
$p_1$ and $p_2$ ($p_1 > p_2$) possessing the attribute, and that two large simple
samples of $n_1$ and $n_2$ individuals respectively give the proportions $p_1'$
and $p_2'$. What is the chance that $p_1' - p_2' \le 0$? In other words, is the
real difference between the populations likely to be hidden in
sampling?

$p_1' - p_2'$ is normally distributed for large $n_1$ and $n_2$.





We have $E(p_1' - p_2') = E(p_1') - E(p_2') = p_1 - p_2$,

$$e^2 = V(p_1' - p_2') = V(p_1') + V(p_2') = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}.$$

Thus $p_1' - p_2' \sim N(p_1 - p_2,\ e^2)$.

$p_1' - p_2' \le 0 \Rightarrow (p_1' - p_2') - (p_1 - p_2)$ must be on the negative side, and numerically greater than $p_1 - p_2$.

If $p_1 - p_2 > 3e$, then the event $p_1' - p_2' \le 0$ requires a deviation numerically greater than $3e$, and the probability of this is very small.

If $\dfrac{p_1 - p_2}{e} < 2$, such an event would not be very unusual. For the particular case in which $p_1 - p_2 = 2e$ the probability of the event is in or near the interval 2 to 2½%.

[Note. Take $Z = \dfrac{(p_1' - p_2') - (p_1 - p_2)}{e}$, i. e. $Z \sim N(0, 1)$.

Thus $P[p_1' - p_2' \le 0] = P[Ze + p_1 - p_2 \le 0] = P\left[Z \le -\dfrac{p_1 - p_2}{e}\right]$. If $p_1 - p_2 > 1.645\,e$, then $P\left[Z \le -\dfrac{p_1 - p_2}{e}\right]$ is less than 5%, and we say that it is unlikely that the difference will be hidden in sampling.]

Ex. 1. In a simple sample of 600 men from a certain large city, 400 are found to be smokers. In one of 900 from another large city, 450 are smokers. Do the data indicate that the cities are significantly different with respect to prevalence of smoking among men ? (B. Sc. Delhi ’66)

Here $p_1 = \frac{400}{600} = \frac{2}{3}$, $p_2 = \frac{450}{900} = \frac{1}{2}$.

Assume that the cities are alike with respect to the prevalence of smoking among men. The best estimate of the common value of $p$ is

$$p = \frac{400 + 450}{600 + 900} = \frac{17}{30},$$

so that

$$e^2 = V(p_1 - p_2) = pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right) = \frac{17}{30} \cdot \frac{13}{30}\left(\frac{1}{600} + \frac{1}{900}\right) = 0.000682$$

$\Rightarrow e = 0.026$.

Hence $\dfrac{p_1 - p_2}{e} = \dfrac{0.167}{0.026} = 6.4$.

The observed difference is much greater than $3e$ and is therefore highly significant. Our assumption that the populations are similar is therefore almost certainly wrong.
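The pooled test of this section reduces to a short computation. The following is a minimal sketch; the function name `pooled_two_proportion_z` is an illustrative choice, and the inputs are the counts from Ex. 1.

```python
from math import sqrt

def pooled_two_proportion_z(x1, n1, x2, n2):
    """Return ((p1 - p2) / e, e), where e^2 = p q (1/n1 + 1/n2) and p is the pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                 # best estimate of the common proportion
    e = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / e, e

# Ex. 1: 400 smokers out of 600 men, and 450 out of 900
ratio, e = pooled_two_proportion_z(400, 600, 450, 900)
print(round(e, 3), round(ratio, 1))  # prints 0.026 6.4
```

A ratio above 3 is read, as in the text, as a highly significant difference; a ratio below 2 as not significant.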

Ex. 2. A machine puts out 16 imperfect articles in a sample of 500. After the machine is overhauled, it puts out 3 imperfect articles in a batch of 100. Has the machine been improved ?

(U. P. C. S. ’62)

Here $p_1 = \frac{16}{500} = 0.032$, $p_2 = \frac{3}{100} = 0.03$, so that $p_1 - p_2 = 0.002$.

We make the hypothesis that the machine has not improved. On this basis the estimate of the common value of $p$ obtained by combining the samples is

$$p = \frac{16 + 3}{500 + 100} = \frac{19}{600}.$$

Hence

$$e = \sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{\frac{19}{600} \cdot \frac{581}{600}\left(\frac{1}{500} + \frac{1}{100}\right)} = 0.0192.$$

Therefore $\dfrac{p_1 - p_2}{e} = 0.104 < 2$,

that is, the difference 0.002 is not significant and we conclude that the machine has not been improved.

Ex. 3. In two large populations, there are 30 and 25 percent respectively of fair haired people. Is this difference likely to be hidden in samples of 1200 and 900 respectively from the two populations ? [B. Sc. Agra ’58, B. A. (Hon’s) Delhi ’54]

Here $p_1 = \frac{30}{100} = 0.30$, $p_2 = \frac{25}{100} = 0.25$, so that $p_1 - p_2 = 0.05$. The variance of the difference of the proportions in the samples is

$$e^2 = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2} = \frac{(0.30)(0.70)}{1200} + \frac{(0.25)(0.75)}{900} = 0.000383,$$

so that $e = 0.0196$. Hence $z = \dfrac{0.05}{e} = 2.6$ nearly

and $P[Z < -2.6] < \tfrac{1}{2}\%$.

Therefore it is unlikely that the difference will be hidden.
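The probability that a real difference is hidden can also be evaluated directly from the normal distribution. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; the function name `prob_difference_hidden` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def prob_difference_hidden(p1, n1, p2, n2):
    """P[p1' - p2' <= 0] when the true proportions are p1 > p2 (normal approximation)."""
    e = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # unpooled s.e. of the difference
    return NormalDist().cdf(-(p1 - p2) / e)

# Ex. 3: 30% and 25% fair haired, samples of 1200 and 900
prob = prob_difference_hidden(0.30, 1200, 0.25, 900)
print(prob < 0.01)  # prints True: the difference is unlikely to be hidden
```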

Ex. 4. If for one half of n events, the chance of success is p and the chance of failure q, whilst for the other half, the chance of success is q and the chance of failure is p, show that the standard deviation of the number of successes is the same as if the chance of success were p in all the cases, i. e. $\sqrt{npq}$, but that the mean of the number of successes is $n/2$ and not $np$. [B. Sc. Cal. ’64]

Suppose $x_1$ and $x_2$ are the numbers of successes in the first and second half of the $n$ events. Then

$$E(x_1) = \frac{np}{2}, \quad E(x_2) = \frac{nq}{2}, \quad V(x_1) = \frac{npq}{2}, \quad V(x_2) = \frac{npq}{2}.$$

Hence

$$E(x_1 + x_2) = E(x_1) + E(x_2) = \frac{n(p + q)}{2} = \frac{n}{2},$$

$$V(x_1 + x_2) = V(x_1) + V(x_2) = npq.$$

That is, the variance is the same as if the probability of success in all the $n$ events is $p$.

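Ex. 5. The sex ratio at birth is sometimes given by the ratio of male to female births, instead of the proportion of male to total births. If z is the ratio, i. e. $z = p/q$, show that the standard error of z is approximately

$$(1 + z)\sqrt{\frac{z}{n}},$$

n being large so that deviations are small compared with the mean.

We are given $z = p/q$, $p$ being the proportion of male births and $q$ the proportion of female births.

Now

$$z = \frac{p}{q} = \frac{p}{1 - p}, \quad \text{so that} \quad dz = \frac{dp}{(1 - p)^2} = \frac{dp}{q^2}.$$

According to the given condition the deviations are small, so that

s. e. of $z = \dfrac{1}{q^2} \times$ s. e. of the proportion of male to total births

$$= \frac{1}{q^2}\,(\text{s. e. of } p) = \frac{1}{q^2}\sqrt{\frac{pq}{n}} = (1 + z)\sqrt{\frac{z}{n}},$$

since $q = \dfrac{1}{1 + z}$ and $p = \dfrac{z}{1 + z}$.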


Exercises 

1. A die is thrown 9000 times and a throw of 3 or 4 is observed 3240 times. Show that the die cannot be regarded as an unbiased one, and find the limits between which the probability of a throw of 3 or 4 lies. [B. A. Hon’s Delhi ’63]

[Ans. The limits are 0.345 and 0.375]

2. A biased coin was thrown 400 times and heads resulted 240 times. Find the s. e. of the observed proportion of heads and deduce that the probability of getting a head in a throw of the coin lies almost certainly between 0.53 and 0.67.

[B. A. Hon’s Delhi ’64, B. Sc. Kurukshetra ’65]


3. Certain crosses of pea gave 5321 yellow and 1804 green seeds. The expectation is 25 percent of green seeds on a Mendelian hypothesis. Can the divergence from the expected value have arisen from the fluctuations of simple sampling only ?

4. 400 eggs are taken at random from a large consignment and 50 are found to be bad. Estimate the percentage of bad eggs in the consignment and assign limits within which the percentage probably lies. [Ans. 7.5 and 17.5 approx.]

5. Given that, on the average, 4% of insured men of age 65 die within a year and that 60 of a particular group of 1000 such men died within a year, show that this group cannot be regarded as a representative sample, seeing that the actual deviation of the proportion of deaths is more than three times the s. e. of the proportion for samples of this size.

6. A random sample of 900, taken from a large bulk of mass-produced screws, has 5% of its items defective. What information can be inferred about the percentage of defectives in the bulk ? [Ans. Between 0.064 and 0.036]

7. An ordinary die is thrown 1800 times. If an ace or a two turn up 635 times, does this indicate that the true probability of an ace or a two with this die is not $\frac{1}{3}$ ?

8. In March 1963 there were 66400 legitimate live births in a country, while in the whole year there were 740000 such births. Do these figures give evidence of the tendency of some parents to enjoy baby-tax concession for the whole year and yet keep expenditure on the baby down to a minimum by timing his arrival in March, the tax year beginning in April ?

9. If $p_1$, $p_2$ are the proportions of individuals having a certain character A in two independent large samples, deduce the s. e. of $(p_1 - p_2)$ and obtain a test of significance of the difference between the proportions (i. e. test the hypothesis that the two samples, assumed to be independent, are drawn from the same population). [B. A. Hon’s ’55, ’69]

10. In our country 100 men from a sample of 400 are found to be smokers; in a neighbouring country the number was 300 from a sample of 800. Does this show that there is a greater proportion of smokers in the first country than in the second ?

[The proportions in the two countries are different].

11. A civil service examination was given to 200 persons. On the basis of the total scores, they were divided into upper 30 percent, middle 40 percent, and the lower 30 percent. On a certain question, 39 of the upper group and 29 of the lower group got the correct answer. Is this question likely to be useful in discriminating ability of this type ? [B. A. Delhi Hon’s ’62]

12. A civil service examination was given to 200 people. On the basis of their total scores, they were divided into the upper 30% and the remaining 70%. On a certain question, 40 of the upper group and 80 of the remaining group answered correctly. Is this question likely to be useful for discriminating the ability of the type being tested ? [B. A. Delhi ’68]

13. In two large populations there are 35 and 30% of fair-haired people. Is the difference likely to be revealed by simple samples of 1500 and 1000 respectively from the two populations ?

14. In a random sample of 500 men from a particular district of U. P., 300 are found to be smokers. In one of 1010 men from another district, 550 are smokers. Do the data indicate that the two districts are significantly different with respect to the prevalence of smoking among men ? [U. P. C. S. ’53]

[Difference not significant].

15. In a large city A, 20% of a random sample of 900 school boys had defective eye-sight. In another large city B, 15.5% of a random sample of 1600 school boys had the same defect. Is this difference between the two proportions significant ?

[I. A. S. ’62, M. Sc. Agra ’58, ’68]

[Difference not significant]


16. In a year there are 956 births in a town A, of which 52.5% were males, while in towns A and B combined, this proportion in a total of 1406 births was 0.495. Is there any significant difference in the proportion of male births in the two towns ?

[I. A. S. ’61, B. A. Hon’s Delhi ’71]

[Hint.
$$0.495 = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{956 \times 0.525 + 450\,p_2}{1406} \Rightarrow p_2 = 0.432.$$

Consider
$$Z = \frac{p_1 - p_2}{\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}\ ]$$


17. The subject under investigation is the measure of dependence of Tamil on words of Sanskrit origin. One newspaper article reporting the proceedings of the Constituent Assembly contained 2,025 words of which 729 were declared by a literary critic to be of Sanskrit origin. A second article by the same author describing atomic research contained 1,600 words of which 640 were declared by the same critic to be of Sanskrit origin. Assuming that the simple sampling conditions hold, estimate the limits for the proportion of Sanskrit words in the writer’s vocabulary and examine whether there is any significant difference in the dependence of the writer on words of Sanskrit origin in writing on these two subjects.

[I. A. S. ’47]

[Hint. The estimate of percentage of Sanskrit terms in the writer’s vocabulary in both the articles combined is

$$p = \left(\frac{729 + 640}{2025 + 1600}\right) \times 100 = 37.77\%,$$

so that $q = 62.23\%$.

Then s. e. of the difference between the percentages

$$e = \sqrt{37.77 \times 62.23\left(\frac{1}{2025} + \frac{1}{1600}\right)} = 1.62\%.$$

$p_1$ = proportion of Sanskrit words in the first article $= \frac{729}{2025} \times 100 = 36\%$,

$p_2$ = proportion of Sanskrit words in the second article $= \frac{640}{1600} \times 100 = 40\%$.

Now
$$Z = \frac{|p_1 - p_2|}{e} = \frac{4}{1.62} = 2.47,$$
which is greater than 2 but less than 3.

Thus the observed difference is significant at the 5% level, although it does not exceed three times its s. e.

Limits are

$$\left[37.77 \pm 3\sqrt{\frac{37.77 \times 62.23}{3625}}\right]\% = (37.77 \pm 3 \times 0.81)\%$$
= 35.34% and 40.20%]

18. A railway company installed two sets of 50 Burma ties each. The two sets were treated with creosote by two different processes. After a number of years of service it was found that 22 ties of the first set and 18 ties of the second set were still in good condition. Are we justified in claiming that there is no real difference between the preserving properties of the two processes ? [I. A. S. ’67]


12.5. Some modifications of the conditions of simple sampling.

I Case : Change in the chance of success at each drawing. Let $p_i$ be the probability of success at the $i$-th drawing. Then

$$E(x) = \sum_{i=1}^{n} E(x_i) = \sum_i p_i = np, \quad \text{where } p = \frac{1}{n}\sum_i p_i = E(p_i), \quad ...(1)$$

where $x$ denotes the number of successes in the sample ($x_i$ denoting the observed successes in a sample of one member). Since the drawings are independent,

$$e^2 = V(x) = \sum_i V(x_i) = \sum_i p_i q_i = \sum_i p_i - \sum_i p_i^2.$$

Let

$$\sigma_p^2 = V(p_i) = E(p_i - p)^2 = \frac{1}{n}\sum_i (p_i - p)^2 = \frac{1}{n}\sum_i p_i^2 - p^2. \quad ...(2)$$

Then

$$n\sigma_p^2 + np^2 = \sum_i p_i^2.$$

Thus
$$e^2 = \sum_i p_i - \sum_i p_i^2 = np - n(p^2 + \sigma_p^2) = npq - n\sigma_p^2. \quad ...(3)$$

Since $V(x) = npq$ when the probability of success $p$ remains constant at each drawing, we conclude that in the above case $V(x)$ is less than when the probability is constant and equal to $p$.

Now
$$e'^2 = V\left(\frac{x}{n}\right) = \frac{1}{n^2}V(x) = \frac{pq}{n} - \frac{\sigma_p^2}{n}.$$

This case is called Poisson’s series of trials.
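Result (3) is an exact identity and can be verified numerically for any set of probabilities. The sketch below uses an arbitrary illustrative list of $p_i$; the function name is hypothetical.

```python
def poisson_series_variance(ps):
    """Return (direct, formula): sum of p_i q_i, and n p q - n sigma_p^2 from result (3)."""
    n = len(ps)
    p = sum(ps) / n                                  # mean chance of success
    var_p = sum((pi - p) ** 2 for pi in ps) / n      # sigma_p^2
    direct = sum(pi * (1 - pi) for pi in ps)         # V(x) for independent drawings
    formula = n * p * (1 - p) - n * var_p
    return direct, formula

# Illustrative chances of success at four successive drawings
direct, formula = poisson_series_variance([0.2, 0.4, 0.6, 0.8])
print(round(direct, 6), round(formula, 6))  # prints 0.8 0.8
```

With constant chance $p = 0.5$ the variance would be $npq = 1.0$, so the varying chances give the smaller value, as stated.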

II Case. In N simple samples of attributes of n members, the probability of success varies from sample to sample. Let

Sample No. :        1      2     ...   i     ...   N
Prob. of success :  p_1    p_2   ...   p_i   ...   p_N

We wish to find the S. E. of the number of successes per sample, when the records of all the samples are pooled.

Let $x_i$ denote the number of successes in the $i$-th sample and let

$$p = \frac{1}{N}\sum_i p_i.$$

Then

$$E(x) = \sum_i E(x_i) = \sum_i n p_i = nNp,$$

$x$ being the number of successes in the whole series of samples.

$\therefore\ E\left(\dfrac{x}{N}\right) = np$ = expected value of the number of successes per sample.

Now $E(x_i) = np_i$

and
$$E(x_i - np)^2 = E(x_i - np_i + np_i - np)^2 = E(x_i - np_i)^2 + n^2(p_i - p)^2 \quad ...(4)$$
$$= np_i q_i + (np_i - np)^2.$$

$e^2$ = variance of the number of successes per sample

$$= \frac{1}{N}\sum_i E(x_i - np)^2 = \frac{1}{N}\left[n\sum_i p_i q_i + n^2\sum_i (p_i - p)^2\right]. \quad ...(5)$$

But $\sum_i p_i q_i = \sum_i p_i - \sum_i p_i^2 = Np - N(\sigma_p^2 + p^2)$

and $\sum_i (p_i - p)^2 = N\sigma_p^2$.

Thus
$$e^2 = \frac{1}{N}\left[nNpq - nN\sigma_p^2 + n^2 N\sigma_p^2\right]$$

or
$$e^2 = npq + n(n - 1)\sigma_p^2. \quad ...(6)$$

Variance of the proportion of successes per sample

$$e'^2 = \frac{e^2}{n^2} = \frac{1}{n^2}\left[npq + n(n - 1)\sigma_p^2\right] = \frac{pq}{n} + \frac{n - 1}{n}\sigma_p^2. \quad ...(7)$$

The variances in (6) and (7) are greater than in the case of simple sampling with constant probability $p$.

This is called Lexian series of trials.
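Result (6) can be checked in the same way against the pooled form (5). The probabilities and the sample size below are illustrative values only, and the function name is hypothetical.

```python
def lexian_variance(ps, n):
    """Return (pooled, formula): form (5) averaged over samples, and form (6)."""
    N = len(ps)
    p = sum(ps) / N
    var_p = sum((pi - p) ** 2 for pi in ps) / N
    # Each sample contributes n p_i q_i + (n p_i - n p)^2, as in (4)
    pooled = sum(n * pi * (1 - pi) + (n * pi - n * p) ** 2 for pi in ps) / N
    formula = n * p * (1 - p) + n * (n - 1) * var_p
    return pooled, formula

pooled, formula = lexian_variance([0.2, 0.4, 0.6, 0.8], n=10)
print(round(pooled, 6), round(formula, 6))  # prints 7.0 7.0
```

Here $npq = 2.5$, so the pooled variance is much larger than under constant probability, in agreement with the remark following (7).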

III Case. p remains constant, but the size of the sample varies.

Thus $E(n_i) = \bar{n}$, $E(n_i - \bar{n})^2 = \sigma_n^2$, $n_i$ being the size of the $i$-th sample.

If $x_i$ denotes the number of successes in the $i$-th sample,

$$E(x_i) = E(n_i p) = \bar{n}p.$$

$$E(x_i - \bar{n}p)^2 = E(x_i - n_i p + n_i p - \bar{n}p)^2 = E(x_i - n_i p)^2 + (n_i p - \bar{n}p)^2$$
$$= n_i pq + (n_i p - \bar{n}p)^2.$$

Now $e^2$ = variance of the number of successes per sample

$$= E\left[n_i pq + p^2(n_i - \bar{n})^2\right] = \bar{n}pq + p^2\sigma_n^2.$$

12.6. Sampling of values of a variable.

Now we consider sampling of values of a variable and measurable quantity such as height, age, yield of grain etc. We choose n values of the variable from those of the distribution we are sampling. Random sampling is sampling in which each member of the population has the same chance of being chosen. Suppose there are $f_i$ members of the population with value $x_i$, for each $i$; thus the probability of choosing a member with value $x_i$ is $f_i/N$, where $N = \sum f_i$. If the distribution is continuous with relative frequency density $f(x)$, the probability that, in the random selection of an individual, the variable will fall in the interval $dx$ is $f(x)\,dx$, i. e.

$$P(x \le X \le x + dx) = f(x)\,dx.$$

Simple sampling is random sampling in which the probability of obtaining a value of the variable within any specified range remains constant throughout the sampling. In simple sampling the system of probabilities associated with any drawing is independent of the results of preceding drawings. A population whose distribution is continuous contains an infinite number of values in any finite interval in the range of the variable. Thus a finite random sample from a continuous distribution is a simple sample. The sample from a finite population will be simple when each value selected is returned to the population before the drawing of the next value.

Sampling distributions. The distribution of the variable in the population has its mean, variance, moments of higher order, partition values etc., which are known as the parameters of the population. Each simple sample from the population determines a frequency distribution of the variable, from which the mean, variance, etc. of the sample may be calculated. These are known as the estimates of the parameters of the population. Any estimate obtained from the sample is called a statistic. Generally, any function of the sample values, used as an estimate of the parameter of the population, is called a statistic.

When the distribution of the variable x in the population is known, it is possible to determine the probability that the estimate z of any parameter, obtained from a simple sample of n members, will lie in the interval dz. Let this probability be denoted by $\phi(z)\,dz$. The density $\phi(z)$ determines a probability distribution which is called the sampling distribution of the statistic z for simple samples of size n. The sampling distribution is a continuous distribution, which is determined by the nature of the population and the size of the sample. The S. D. of the sampling distribution of the statistic is called the standard error (S. E.). The sampling distribution is sometimes defined as the distribution of the values of the statistic obtained from an infinite number of simple samples of the given size.

Distributions of statistics for random samples from a normal population play an important part. If a random sample of values $x_1, x_2, \ldots, x_n$ is drawn from a normal population with mean $\mu$ and variance $\sigma^2$, then

$$M_{\bar{x}}(t) = \prod_{i=1}^{n} M_{x_i}\!\left(\frac{t}{n}\right) = e^{\mu t + \frac{1}{2}\frac{\sigma^2}{n}t^2},$$

where $\bar{x} = \dfrac{\sum x_i}{n}$ = sample mean.

This result implies that the mean of a sample from a normal population is normally distributed with mean $\mu$ and variance $\sigma^2/n$, i. e. in symbols

$$\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$

This result shows that the precision of a sample mean increases as the sample size increases. The sampling distributions of many of the common statistics approximate to the normal type as $n \to \infty$, so that for large samples they possess the property that the probability of a sample value of the statistic deviating from its mean by more than 3 times its S. E. is very small.

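12.7. Sampling distribution of the mean.

Let the sampled population have mean $\mu$ and variance $\sigma^2$. We prove that the sampling distribution of the mean $\bar{x}$ of random samples of size $n$ has $\mu$ for its mean and $\dfrac{\sigma^2}{n}$ for its variance.

We have

$$E(\bar{x}) = E\left[\frac{1}{n}(x_1 + x_2 + \cdots + x_n)\right] = \frac{1}{n}\left[E(x_1) + E(x_2) + \cdots + E(x_n)\right] = \mu,$$

$$E(\bar{x} - \mu)^2 = E\left[\frac{1}{n}(x_1 + x_2 + \cdots + x_n) - \mu\right]^2 = \frac{1}{n^2}E\left[(x_1 - \mu) + (x_2 - \mu) + \cdots + (x_n - \mu)\right]^2$$

$$= \frac{1}{n^2}\left[\sigma^2 + \sigma^2 + \cdots + \sigma^2\right], \quad \text{since } E(x_i - \mu)(x_j - \mu) = 0 \text{ for } i \ne j,$$

$$= \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.$$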
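For a small finite population sampled with replacement, the result can be confirmed by listing every possible sample. The sketch below is one way to do this; the population values are arbitrary and the function name is hypothetical.

```python
from itertools import product
from statistics import fmean, pvariance

def sampling_distribution_of_mean(values, n):
    """Mean and variance of the sample mean over all equally likely samples of size n."""
    means = [fmean(sample) for sample in product(values, repeat=n)]
    return fmean(means), pvariance(means)

population = [1, 2, 3, 6]                     # mean 3, variance 3.5
m, v = sampling_distribution_of_mean(population, n=3)
print(round(m, 6), round(v, 6), round(pvariance(population) / 3, 6))
```

The first printed number agrees with the population mean, and the last two agree with each other, i. e. the variance of the sample mean equals $\sigma^2/n$.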

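Theorem. If the variable x has any distribution (other than normal) with mean $\mu$ and finite variance $\sigma^2$ and if the m. g. f. exists, then the variate

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$

as n tends to infinity.

We have

$$M_Z(t) = E\left[e^{tZ}\right] = e^{-\mu t\sqrt{n}/\sigma}\,E\left[e^{t\sqrt{n}\,\bar{x}/\sigma}\right] = e^{-\mu t\sqrt{n}/\sigma}\,E\left[e^{\frac{t}{\sigma\sqrt{n}}\sum x_i}\right]$$

$$= e^{-\mu t\sqrt{n}/\sigma}\prod_{i=1}^{n} E\left[e^{\frac{t x_i}{\sigma\sqrt{n}}}\right] = e^{-\mu t\sqrt{n}/\sigma}\left[M_x\!\left(\frac{t}{\sigma\sqrt{n}}\right)\right]^n.$$

$$\log M_Z(t) = -\frac{\mu\sqrt{n}}{\sigma}\,t + n\log M_x\!\left(\frac{t}{\sigma\sqrt{n}}\right)$$

$$= -\frac{\mu\sqrt{n}}{\sigma}\,t + n\log\left[1 + \mu_1'\frac{t}{\sigma\sqrt{n}} + \mu_2'\frac{t^2}{2\sigma^2 n} + \cdots\right]$$

$$= -\frac{\mu\sqrt{n}}{\sigma}\,t + n\left[\left(\mu_1'\frac{t}{\sigma\sqrt{n}} + \mu_2'\frac{t^2}{2\sigma^2 n} + \cdots\right) - \frac{1}{2}\left(\mu_1'\frac{t}{\sigma\sqrt{n}} + \cdots\right)^2 + \cdots\right]$$

$$= \frac{t^2}{2} + O\left(n^{-1/2}\right),$$

since $\mu_1'$ = mean $= \mu$ and $\mu_2' - \mu_1'^2 = \sigma^2$.

$$\lim_{n \to \infty}\log M_Z(t) = \frac{t^2}{2}, \qquad \lim_{n \to \infty} M_Z(t) = e^{t^2/2}$$

$\Rightarrow Z \sim N(0, 1)$.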

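The above result can be established by finding the various moments of the sampling distribution of $\bar{x}$ by using the additive property of cumulants.

We have

$$n\bar{x} = x_1 + x_2 + \cdots + x_n,$$

$$C_{n\bar{x}}(t) = C_{x_1 + x_2 + \cdots + x_n}(t) = C_{x_1}(t) + C_{x_2}(t) + \cdots + C_{x_n}(t)$$

$$= \log M_{x_1}(t) + \log M_{x_2}(t) + \cdots + \log M_{x_n}(t) = n\log M_x(t)$$

$$= nC_x(t) = n\left[k_1 t + k_2\frac{t^2}{2!} + \cdots + k_r\frac{t^r}{r!} + \cdots\right], \quad ...(A)$$

the cumulants $k_r$ being those of the population.

Now since the $r$th moment of $n\bar{x}$ is $n^r$ times the $r$th moment of $\bar{x}$, we have

$$M_{n\bar{x}}(t) = E\left(e^{tn\bar{x}}\right) = M_{\bar{x}}(nt), \quad \text{or} \quad C_{\bar{x}}(t) = C_{n\bar{x}}\!\left(\frac{t}{n}\right),$$

so that

$$C_{\bar{x}}(t) = k_1 t + \frac{k_2}{n}\frac{t^2}{2!} + \frac{k_3}{n^2}\frac{t^3}{3!} + \frac{k_4}{n^3}\frac{t^4}{4!} + \cdots$$

[In (A) above replace $t$ by $t/n$.]

The $r$th cumulant for the distribution of the mean of the sample is thus found from that of the population by dividing by $n^{r-1}$.

Thus $E(\bar{x}) = k_1$ = mean of the population,

$$\mu_2 = \sigma_{\bar{x}}^2 = \frac{k_2}{n} = \frac{\sigma^2}{n},$$

and the third moment about the mean of the distribution $= k_3/n^2$
[recall $k_2 = \mu_2 = \sigma^2$, $k_3 = \mu_3$, $k_4 = \mu_4 - 3\mu_2^2$].

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(k_3/n^2)^2}{(k_2/n)^3} = \frac{1}{n}\frac{k_3^2}{k_2^3} \qquad [\because\ \mu_3 = k_3/n^2,\ \mu_2 = k_2/n]$$

$$\text{Skewness} = \gamma_1 = \sqrt{\beta_1} = \frac{1}{\sqrt{n}}\ (\text{skewness of the population}),$$

$$\gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{k_4/n^3}{(k_2/n)^2} = \frac{1}{n}\ (\text{excess of the population}).$$

Thus for large samples from a population of moderate skewness and excess, the skewness and excess of the distribution of $\bar{x}$ are small, and the distribution is approx. normal since for the normal distribution $\gamma_1 = 0$ and $\gamma_2 = 0$. Thus

$$\bar{x} \sim N(\mu, \sigma^2/n).$$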
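The division of the skewness by $\sqrt{n}$ and of the excess by $n$ holds exactly for simple samples, and may be verified by enumeration for a small population. This is a sketch only; the population values and function names are illustrative.

```python
from itertools import product

def gamma1_gamma2(xs):
    """Skewness (gamma_1) and excess (gamma_2) of an equally weighted list of values."""
    m = sum(xs) / len(xs)
    mu = lambda r: sum((x - m) ** r for x in xs) / len(xs)   # r-th central moment
    return mu(3) / mu(2) ** 1.5, mu(4) / mu(2) ** 2 - 3

def gammas_of_mean(values, n):
    """gamma_1 and gamma_2 of the sample mean over all samples of size n (with replacement)."""
    means = [sum(sample) / n for sample in product(values, repeat=n)]
    return gamma1_gamma2(means)

population = [0, 1, 2, 7]
g1, g2 = gamma1_gamma2(population)
g1_mean, g2_mean = gammas_of_mean(population, n=4)
print(abs(g1_mean - g1 / 2) < 1e-9, abs(g2_mean - g2 / 4) < 1e-9)  # prints True True
```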

Significance test. Since $\bar{x} \sim N(\mu, \sigma^2/n)$,

$$u = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$

Suppose we formulate the hypothesis that a sample of size n has been drawn from a normal population with mean $\mu_0$ and known S. D. $\sigma$. Let

$$u_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}.$$

Then if $|u_0| < 1.96$ the hypothesis is acceptable at 5% level of significance, and if $|u_0| \ge 1.96$, the hypothesis is to be discredited.

Fiducial limits for unknown mean. Suppose the population from which the random sample of size n is drawn is normal with mean $\mu$ and S. D. $\sigma$. Then as above the sample mean $\bar{x}$ is a $N(\mu, \sigma^2/n)$ variate. If we know $\sigma^2$ but not $\mu$, there is a range of possible values of $\mu$ for which the observed mean $\bar{x}$ of the sample is not significant at any specified level of probability.

If observed $\bar{x}$ is not significant at 5% level of probability then

$$\left|\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right| < 1.96$$

$$\Rightarrow\ \bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}.$$

$\bar{x} \pm 1.96\dfrac{\sigma}{\sqrt{n}}$ are called 95% fiducial (or confidence) limits for the mean of the population corresponding to the given sample. 98% confidence limits for the mean of the population are given by $\bar{x} \mp 2.33\dfrac{\sigma}{\sqrt{n}}$ and 99% confidence limits by $\bar{x} \mp 2.58\dfrac{\sigma}{\sqrt{n}}$.
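The multipliers 1.96, 2.33 and 2.58 are percentage points of the standard normal distribution, and limits for any other confidence coefficient follow in the same way. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; the function name is hypothetical, and the numbers in the usage lines are those of Ex. 7 below.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_limits(xbar, sigma, n, level=0.95):
    """Limits xbar -/+ z * sigma / sqrt(n), with z the two-sided normal percentage point."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Sample of 900 with mean 3.4 cm, population S.D. 2.61 cm
lo95, hi95 = mean_confidence_limits(3.4, 2.61, 900, level=0.95)
lo98, hi98 = mean_confidence_limits(3.4, 2.61, 900, level=0.98)
print(round(lo95, 2), round(hi95, 2))  # prints 3.23 3.57
```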

Ex. 1. A sample of 400 male students is found to have a mean height of 67.47 inches. Can it be reasonably regarded as a sample from a large population with mean height 67.39 inches and S. D. 1.30 inches ? [M. Sc. Agra ’61, B. A. (Hon’s) Delhi ’63, ’65]

Here $n = 400$, $\bar{x} = 67.47$, $\mu = 67.39$, $\sigma = 1.30$.

The sample satisfies the conditions of simple sampling and therefore $u = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ is the standard normal variate. Now

$$|u| = \left|\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right| = \frac{67.47 - 67.39}{(1.30)/20} = 1.23 < 1.96.$$

Hence the value of $u$ is not significant at 5% level of probability and therefore the given sample provides no evidence against the hypothesis. It can be reasonably regarded that the given sample is drawn from a population with mean 67.39 and S. D. 1.30.
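The test of Ex. 1 amounts to one line of arithmetic, sketched here with a hypothetical helper `z_for_mean`.

```python
from math import sqrt

def z_for_mean(xbar, mu0, sigma, n):
    """Standard normal deviate u = (xbar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / sqrt(n))

u = z_for_mean(67.47, 67.39, 1.30, 400)
print(round(u, 2), abs(u) < 1.96)  # prints 1.23 True
```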


Ex. 2. A sample of 900 members is found to have a mean of 3.4 cm. Can it be reasonably regarded as a simple sample from a large population whose mean is 3.25 cm. and S. D. 2.61 cm. ?

[B. A. Delhi ’67]

Ex. 3. Could a sample of S00 individuals with mean 4.38 and S. D. 2.911 have arisen by chance-fluctuations from a population with mean 4.5 ?

Ex. 4. A sample of 1000 individuals is 64 inches and S. D. is 3 inches. The mean of the sample is 63.5 inches. Is the difference significant ?

Ex. 5. Mean of 10 readings on the length of a given rod is 20 inches. The S. D. of errors of measurement is known to be 0.1 inch. Does the result contradict the assumption that the length of the rod is 19.9 inches ? [M. Sc. Agra ’62]

Ex. 6. A research worker wishes to estimate the mean of a population by using a sufficiently large sample. The probability is 95 percent that the sample mean will not differ from the true mean by more than 25 percent of the standard deviation. How large a sample should be taken ? [B. Sc. Agra ’60]

[Hint. With probability 95 percent,
$$|\bar{x} - \mu| < \frac{1.96\,\sigma}{\sqrt{n}}.$$
We require
$$|\bar{x} - \mu| < \frac{\sigma}{4}, \quad \text{i. e.} \quad \frac{1.96\,\sigma}{\sqrt{n}} \le \frac{\sigma}{4}$$
$$\Rightarrow\ n \ge 16 \times (1.96)^2 = 62 \text{ nearly}]$$
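The hint to Ex. 6 generalises to any confidence coefficient and any tolerance expressed as a fraction of $\sigma$. A minimal sketch, with a hypothetical function name and assuming Python 3.8 or later:

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(tolerance_in_sigmas, level=0.95):
    """Smallest n with z * sigma / sqrt(n) <= tolerance, tolerance given in units of sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return ceil((z / tolerance_in_sigmas) ** 2)

print(sample_size_for_mean(0.25))  # prints 62
```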


Ex. 7. A sample of 900 members has a mean 3.4 cm. and S. D. 2.61 cm. Is the sample from a large population of mean 3.28 cm. and S. D. 2.61 cm. ?

If the population is normal and its mean is unknown, find the 95% and 98% fiducial limits of true mean. [B. Sc. Agra ’63]

Ex. 8. It is known that the mean and standard deviation of a variable are respectively 100 and 10 in the universe. It is however considered sufficient to draw a sample of sufficient size but such as to ensure that the mean of the sample would be in all probability within 0.01% of the true value. How much would be the cost (exclusive of overhead charges) if the charges for drawing 100 members of a sample be one rupee ?

Find the extra cost necessary to double the precision.

[I. A. S. ’47]

[Hint.
$$\frac{|\bar{x} - \mu|}{\sigma/\sqrt{n}} = 3 \ \Rightarrow\ \frac{0.01}{10/\sqrt{n}} = 3 \ \Rightarrow\ \sqrt{n} = 3{,}000 \ \Rightarrow\ n = 9{,}000{,}000.$$

Charges = Rs. 90,000. To double the precision, $\dfrac{0.005}{10/\sqrt{n}} = 3$

$\Rightarrow n = 36{,}000{,}000$. Extra cost = 360,000 - 90,000 = Rs. 270,000]

Ex. 9. The guaranteed average life of a certain type of electric light bulbs is 1000 hours with a S. D. of 125 hours. It is decided to sample the output so as to ensure that 90 percent of the bulbs do not fall short of the guaranteed average by more than 2.5 percent. What must be the minimum size of the sample ?

[n = 41 approx.] [B. Sc. Agra ’64]

Ex. 10. An unbiased coin is thrown n times. It is desired that the relative frequency of the appearance of heads should be between 0.49 and 0.51. Find the smallest value of n that will ensure this result with 90% confidence.

[Hint. $Z = \dfrac{p - \frac{1}{2}}{\frac{1}{2\sqrt{n}}}$, $p$ being the observed proportion.

$P\{|Z| < 1.645\} = 0.90$. Hence $\dfrac{|p - 0.5|}{\frac{1}{2\sqrt{n}}} < 1.645$

$\Rightarrow$ the limits of $p$ are $0.5 \mp \dfrac{1.645}{2\sqrt{n}}$. Now $\dfrac{1.645}{\sqrt{n}} = 0.51 - 0.49$

$\Rightarrow n = 6765$]

Ex. 11. If $p$ is the observed proportion of success in n independent Bernoullian trials, prove that the 95% fiducial limits for the population proportion $p'$, for large samples, are

$$p \pm 1.96\sqrt{\frac{pq}{n}}.$$

Also show that 99% fiducial limits are the roots of the quadratic equation

$$(p' - p'^2)(2.58)^2 = n(p' - p)^2.$$

[M. A. Patna ’56]

[Hint. $p = p' \pm 2.58\sqrt{\dfrac{p'q'}{n}} \Rightarrow n(p' - p)^2 = (2.58)^2(p' - p'^2)$]

12.8. Test of significance for the difference of means of two large samples.

We are given two independent simple samples of $n_1$ and $n_2$ members with means $\bar{x}_1$ and $\bar{x}_2$ respectively. Can the difference $\bar{x}_1 - \bar{x}_2$ be ascribed to fluctuations of random sampling, it being regarded that the two samples have been drawn from the same population of S. D. $\sigma$ ? The S. E.’s of the means of the two samples are $\dfrac{\sigma}{\sqrt{n_1}}$ and $\dfrac{\sigma}{\sqrt{n_2}}$. Now let

$$e^2 = V(\bar{x}_1 - \bar{x}_2) = V(\bar{x}_1) + V(\bar{x}_2) = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right).$$

Then $\bar{x}_1 - \bar{x}_2 \sim N(0, e^2)$ if $n_1$ and $n_2$ are large. Therefore if

$$|\bar{x}_1 - \bar{x}_2| > 3e,$$

it can hardly be ascribed to fluctuations of sampling; and our assumption that the samples have been drawn from the same population is almost certainly wrong. If $|\bar{x}_1 - \bar{x}_2| > 2e$, the difference is regarded as significant at the 5% level of probability. If $\sigma$ of the population is unknown, it can be estimated from the combined sample of $n_1 + n_2$ members, unless the variances of the two samples are inconsistent with the assumption that they have been drawn from the same population.

If the two samples are known to have come from different populations with variances $\sigma_1^2$ and $\sigma_2^2$, we can test by a similar procedure whether the two populations have the same mean. Just as above, let the sample means be $\bar{x}_1$ and $\bar{x}_2$ with sizes $n_1$ and $n_2$ respectively. We test the null hypothesis that the two populations have the same mean. Since $\bar{x}_1$ and $\bar{x}_2$ are normally distributed when $n_1$ and $n_2$ are large, we have

$$E(\bar{x}_1 - \bar{x}_2) = 0$$

and
$$e^2 = V(\bar{x}_1 - \bar{x}_2) = V(\bar{x}_1) + V(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2},$$

that is, $\bar{x}_1 - \bar{x}_2$ is a $N(0, e^2)$ variate.

If $|\bar{x}_1 - \bar{x}_2| > 3e$, the difference can hardly be ascribed to the fluctuations of random sampling and our assumption that the two populations have the same mean is almost certainly wrong. If the variances of the populations are not known, they may be replaced by their estimates from the large samples without serious error. The estimates of $\sigma_1^2$ and $\sigma_2^2$ are given by

$$s_1^2 = \frac{1}{n_1}\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2, \qquad s_2^2 = \frac{1}{n_2}\sum_{i=1}^{n_2}(x_{2i} - \bar{x}_2)^2.$$

If it is to be tested that the population means are $\mu_1$ and $\mu_2$, the above test can be applied by considering

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}},$$

which is the $N(0, 1)$ variate for large $n_1$ and $n_2$.

Note : If the parent populations are normal with known variances, the above test of significance is valid for all sample sizes.

Ex. 1. A simple sample of heights of 6,400 Englishmen has a mean of 67.85 in. and a S. D. of 2.56 in., while a simple sample of heights of 1600 Australians has a mean of 68.55 in. and a S. D. of 2.52 in. Do the data indicate that Australians are on the average taller than Englishmen ?

We test the null hypothesis of equality of mean heights with the following data :

$n_1 = 6400$, $\bar{x}_1 = 67.85$, $\sigma_1 = 2.56$
$n_2 = 1600$, $\bar{x}_2 = 68.55$, $\sigma_2 = 2.52$

Therefore the standard normal variate

$$Z = \frac{\bar{x}_2 - \bar{x}_1}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{68.55 - 67.85}{\sqrt{\dfrac{(2.56)^2}{6400} + \dfrac{(2.52)^2}{1600}}}$$

$$= \frac{0.70}{\sqrt{(0.032)^2 + (0.063)^2}} = \frac{0.70}{\sqrt{0.004993}} = \frac{0.70}{0.07}$$

= 10, which is greater than 3.

Hence the data are inconsistent with the assumption that the means of the two populations are equal; and we conclude that Australians are on the average taller than Englishmen.
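The same computation, written as a small Python sketch (the function name `two_sample_z` is illustrative), may be reused for the remaining examples and exercises of this section.

```python
from math import sqrt

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample deviate (mean1 - mean2) / sqrt(sd1^2/n1 + sd2^2/n2)."""
    return (mean1 - mean2) / sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Ex. 1: Australians (68.55, 2.52, 1600) against Englishmen (67.85, 2.56, 6400)
z = two_sample_z(68.55, 2.52, 1600, 67.85, 2.56, 6400)
print(round(z, 1))  # prints 9.9, i.e. about 10 as found above
```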

Ex. 2. A potential buyer of light bulbs bought 50 bulbs of each of two brands A and B. Upon testing these bulbs, he found that brand A had a mean life of 1282 hours with a S. D. of 80 hours, whereas brand B had a mean life of 1208 hours with a S. D. of 94 hours. Can the buyer be quite certain that the two brands differ in quality ?

Suppose the buyer thinks that there is no significant difference in the quality of bulbs of the two brands. The data provide

$n_1 = 50$, $\bar{x}_1 = 1282$, $s_1 = 80$
$n_2 = 50$, $\bar{x}_2 = 1208$, $s_2 = 94$

Therefore

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{1282 - 1208}{\sqrt{\dfrac{(80)^2}{50} + \dfrac{(94)^2}{50}}} = \frac{74}{17.5} = 4.23 > 3,$$

that is, the difference $\bar{x}_1 - \bar{x}_2 = 74$ is highly significant. Hence we conclude that the bulbs of the two brands differ in quality and the brand A is superior to brand B.

Exercises 

1. The means of simple samples of 1000 and 2000 are 67.5 and 68.0 inches respectively. Can the samples be regarded as drawn from the same population of S. D. 2.5 inches ?

[Agra ’52, ’63]

2. Random samples of 500 and 400 have means 11.5 and 10.5 respectively. Can the samples be regarded as drawn from the population of standard deviation 5 ? [U. P. C. S. ’62]

3. If 60 new entrants in a given university are found to have a mean height of 68.60 inches and 50 seniors a mean height of 69.51 inches, is the evidence conclusive that the mean height of the seniors is greater than that of the new entrants ? Assume the S. D. of heights to be 2.48 inches.

4. A random sample of 100 farms in a certain year gives an average yield of barley of 2000 lbs. per acre with a S. D. of 192 lbs. A random sample of 100 farms in the following year gives an average yield of 2100 lbs. per acre with a S. D. of 224 lbs. Show that the data are inconsistent with the hypothesis that the average yield in the country as a whole was the same in the two years.

5. (a) Intelligence test on two groups of boys and girls gives the following results. Examine if the difference is significant :

Girls : Mean = 84   S. D. = 10   Number = 121
Boys :  Mean = 81   S. D. = 12   Number = 81

[U. P. C. S. ’43]

5. (b) Two populations have their means equal, but S. D. of one is twice the other. Show that in the samples of size 2000 from each drawn under simple sampling conditions, the difference of means will in all probability not exceed $0.15\sigma$, where $\sigma$ is the smaller S. D. What is the probability that the difference will exceed half this amount ? [B. Sc. Nagpur ’63]


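[Hint.
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1) \text{ as the sample size is large.}$$

Now $n_1 = n_2 = 2000$, $\sigma_1 = \sigma$, $\sigma_2 = 2\sigma$ gives $Z = \dfrac{\bar{x}_1 - \bar{x}_2}{0.05\,\sigma}$.

$|\bar{x}_1 - \bar{x}_2| < 0.15\sigma \Rightarrow |Z| < 3$.

$P\left[|\bar{x}_1 - \bar{x}_2| > \tfrac{1}{2} \times 0.15\sigma\right] \Rightarrow P\left[|Z| > 1.5\right]$

$\Rightarrow P[|Z| > 1.5] = 1 - P[|Z| < 1.5] = 1 - 0.8664 = 0.1336$]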

6. A random sample of 200 villages was taken from Gorakhpur district and the average population per village was found to be 485 with a standard deviation of 50. Another random sample of 200 villages from the same district gave an average population of 510 per village with a standard deviation of 40. Is the difference between the averages of two samples statistically significant ? Give reasons. [U. P. C. S. ’49]

7. Samples of students were drawn from two universities and from their weights in kgm, means and standard deviations are calculated. Make a large sample test to test the significance of the difference between the means.

                 Mean    S. D.    Size of the sample
University A      55      10           400
University B      57      15           100

[B. Sc. Delhi ’67]

8. In a survey of buying habits, 400 women shoppers are
chosen at random in super market 'A' located in a certain section
of the city. Their average weekly food expenditure is Rs. 250
with a S.D. of Rs. 40. For 400 women shoppers chosen at
random in super market 'B' in another section of the city, the
average weekly food expenditure is Rs. 220 with a S.D. of Rs. 55.
Test at the 1% level of significance whether the average weekly food
expenditures of the two populations of shoppers are equal.

[B.Sc. Bombay '68]

9. If $\bar{x}_1$, $\bar{x}_2$ are the sample means of two independent samples
of sizes $n_1$, $n_2$ from normal populations with known variances $\sigma_1^2$,
$\sigma_2^2$, prove that the 98% fiducial limits for the difference between
population means are

$$\bar{x}_1 - \bar{x}_2 \pm 2.326 \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

How will the above limits be affected if
(i) the populations are not known to be normal,
(ii) the variances are not known,
(iii) the confidence coefficient is 95%?




10. The guaranteed average life of a certain type of electric
light bulbs is 1000 hours with a S.D. of 125 hours. It is decided
to sample the output so as to ensure that 90 per cent of the bulbs
do not fall short of the guaranteed average by more than 2.5 per
cent. What must be the minimum size of the sample?

[Hint. $\bar{x} > 1000 - 2.5\%$ of $1000 = 975$; hence

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{975 - 1000}{125/\sqrt{n}} = -\frac{\sqrt{n}}{5}.$$

$P\left[Z > -\dfrac{\sqrt{n}}{5}\right] = 0.90$ or $P\left[0 < Z < \dfrac{\sqrt{n}}{5}\right] = 0.40$

$\Rightarrow \dfrac{\sqrt{n}}{5} = 1.28$ or $n = 41$ approx.]


11. It is known that 40% of a population die within a year.
30 persons are vaccinated, of whom only 5 die in a year. The
inventor of the vaccine claims that the vaccine has reduced the
mortality by 20%. Examine the following:

(a) Justify his claim.

(b) Discuss whether the vaccine is effective at all.

(c) Find the least value of n such that if n/6 die after vaccination,
his claim is definitely accepted.


[Hint, (a) Z 


0 - 32 - 0*16 


J( 


0*32 x 68 
30 


■) 


>10 


claim is rejected 


(b) 

(c) 



vaccine is effective 


The least value of n is given by 


a 


32 x -68 


n 


T 


12. The mean of a certain normal distribution is equal to
the S.E. of the mean of samples of 100 from that distribution.
Find the probability that the mean of a sample of 25 from the
distribution will be negative. [M.A. Punjab 1952]

[Hint. Let the normal distribution be $N(\mu, \sigma^2)$. By hypothesis

$$\mu = \frac{\sigma}{\sqrt{100}} = \frac{\sigma}{10}.$$

Now $Z = \dfrac{\bar{x} - \sigma/10}{\sigma/5}$ is a normal variate $N(0, 1)$.

$\therefore\ \bar{x} = \dfrac{\sigma}{10} + \dfrac{\sigma}{5} Z$ and hence $\bar{x} < 0 \Rightarrow Z < -\tfrac{1}{2}$.

Required probability

$$= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-1/2} e^{-z^2/2}\, dz = 0.3085.]$$


12.9. Test of significance for the difference of S.D.'s in two large
samples.

Suppose we have at our disposal two large random samples of
sizes $n_1$ and $n_2$ in which the S.D.'s are $S_1$ and $S_2$. Can the
difference $S_1 - S_2$ be attributed to their being taken
from two different populations with S.D.'s $\sigma_1$ and $\sigma_2$, or can it be
attributed to the fluctuations of sampling?

We make the null hypothesis that the two samples have been
drawn from normal populations having the same standard deviation,
that is, $\sigma_1 = \sigma_2 = \sigma$. Under the null hypothesis, the statistic

$$Z = \frac{S_1 - S_2}{\sqrt{\operatorname{Var}(S_1 - S_2)}} \sim N(0, 1)$$

since the samples are large.

Now $E(S_1 - S_2) = E(S_1) - E(S_2) = 0$,

$$\operatorname{Var}(S_1 - S_2) = \operatorname{Var}(S_1) + \operatorname{Var}(S_2) = \frac{\sigma^2}{2n_1} + \frac{\sigma^2}{2n_2}.$$

Hence

$$Z = \frac{S_1 - S_2}{\sigma\sqrt{\dfrac{1}{2n_1} + \dfrac{1}{2n_2}}} \sim N(0, 1).$$

The difference $S_1 - S_2$ will be regarded significant at the 5% level
of probability if $|Z| > 2$.
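The test above can be sketched in a few lines of Python. This is only an illustration: the function name `sd_difference_test` is hypothetical, and when the common $\sigma$ is not supplied the sketch falls back on a size-weighted pooled value, which is an assumption of the example and not something the text prescribes.

```python
from math import sqrt
from statistics import NormalDist


def sd_difference_test(s1, n1, s2, n2, sigma=None):
    """Large-sample Z test for the difference of two sample S.D.'s.

    sigma is the common population S.D. under the null hypothesis.
    If it is not given, a size-weighted pooled value is used
    (an assumption of this sketch).
    """
    if sigma is None:
        sigma = sqrt((n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2))
    # S.E. of S1 - S2 is sigma * sqrt(1/(2 n1) + 1/(2 n2))
    se = sigma * sqrt(1.0 / (2 * n1) + 1.0 / (2 * n2))
    z = (s1 - s2) / se
    # two-sided tail probability from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, se, p_value


# Data of Ex. 2 below, with sigma taken as 6 years approximately
z, se, p = sd_difference_test(6.1, 2500, 5.9, 2000, sigma=6.0)
print(round(se, 3), round(z, 2))  # 0.127 1.57
```

With $|Z|$ well below 2, the difference would not be regarded as significant at the 5% level.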

Ex. 1. Random samples drawn from two countries gave the
following data relating to the heights of adult males.

                          Country A    Country B
Mean height in inches       67.42        67.25
Standard deviation           2.58         2.50
Number in samples            1000         1200

Is the difference between the standard deviations significant?
[B.Sc. (Hons.) Delhi '68]

Ex. 2. The S.D. of a simple sample of 2000 members is 5.9
years, and that of an independent sample of 2500 members is 6.1
years. May the samples be reasonably regarded as from the same
normal population?

On the hypothesis that the samples are from the same normal
population, $\sigma = 6$ approx. The S.E. of the difference of the S.D.'s is

$$6\sqrt{\frac{1}{4000} + \frac{1}{5000}} = 6\,(0.0212) = 0.127 \text{ years}.$$

Thus

$$Z = \frac{6.1 - 5.9}{0.127} = 1.6.$$

The difference of 0.2 year is not significant at the 5% level of
probability; there is thus no real evidence against the hypothesis.

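12.10. Sampling from a finite population without replacement.

Let the population values be $y_1, y_2, \ldots, y_N$ with mean $\mu$ and variance $\sigma^2$, that is,

$$\mu = \frac{1}{N}\sum_{i=1}^{N} y_i, \qquad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (y_i - \mu)^2 = \frac{1}{N}\sum_{i=1}^{N} y_i^2 - \mu^2.$$

Suppose a random sample of $n$ values $x_1, x_2, \ldots, x_n$ is drawn without
replacement. The representative sample value $x_j$ can assume any
one of the values $y_1, y_2, \ldots, y_N$ with probability $\dfrac{1}{N}$.

Here $\bar{x} = \dfrac{1}{n}(x_1 + x_2 + \ldots + x_n)$. Thus

$$E(x_j) = \frac{1}{N}(y_1 + y_2 + \ldots + y_N) = \mu, \qquad E(\bar{x}) = \mu,$$

$$\sigma_{\bar{x}}^2 = E(\bar{x} - \mu)^2 = \frac{1}{n^2} E\left[(x_1 - \mu) + (x_2 - \mu) + \ldots + (x_n - \mu)\right]^2$$

$$= \frac{1}{n^2}\left[\sum_{j=1}^{n} E(x_j - \mu)^2 + \sum_{j \neq k} E(x_j - \mu)(x_k - \mu)\right].$$

Now

$$E(x_j - \mu)^2 = \frac{1}{N}\left[(y_1 - \mu)^2 + (y_2 - \mu)^2 + \ldots + (y_N - \mu)^2\right] = \sigma^2,$$

$$E\left[(x_1 - \mu)(x_2 - \mu)\right] = \frac{1}{N(N-1)} \sum_{i=1}^{N} \sum_{j \neq i} (y_i - \mu)(y_j - \mu)$$

$$= \frac{1}{N(N-1)} \sum_{i=1}^{N} (y_i - \mu)\left[(y_1 - \mu) + (y_2 - \mu) + \ldots + (y_N - \mu) - (y_i - \mu)\right]$$

$$= \frac{1}{N(N-1)} \sum_{i=1}^{N} (y_i - \mu)\left[0 - (y_i - \mu)\right] = \frac{-N\sigma^2}{N(N-1)} = -\frac{\sigma^2}{N-1}.$$

Thus

$$\operatorname{Var}(\bar{x}) = \frac{1}{n^2}\left[n\sigma^2 + n(n-1)\left(-\frac{\sigma^2}{N-1}\right)\right] = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}.$$

If $N$ is large, then

$$\operatorname{Var}(\bar{x}) = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1} \simeq \frac{\sigma^2}{n}\left(1 - \frac{n}{N}\right) \simeq \frac{\sigma^2}{n}.$$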
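The result $\operatorname{Var}(\bar{x}) = \dfrac{\sigma^2}{n}\cdot\dfrac{N-n}{N-1}$ can be checked on a small population by listing every possible sample. The following is a minimal sketch with a made-up population of five values; the function name is hypothetical, and exact fractions are used so that the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import combinations


def mean_variance_without_replacement(population, n):
    """Enumerate all samples of size n drawn without replacement and
    compare the variance of the sample mean with the formula."""
    N = len(population)
    mu = Fraction(sum(population), N)
    sigma2 = sum((Fraction(y) - mu) ** 2 for y in population) / N
    # every unordered sample of n distinct members is equally likely
    means = [Fraction(sum(s), n) for s in combinations(population, n)]
    e_mean = sum(means) / len(means)
    var_mean = sum((m - e_mean) ** 2 for m in means) / len(means)
    formula = sigma2 / n * Fraction(N - n, N - 1)
    return mu, e_mean, var_mean, formula


# Hypothetical population of N = 5 values, samples of n = 2
mu, e_mean, var_mean, formula = mean_variance_without_replacement(
    [2, 3, 6, 8, 11], 2)
print(mu, e_mean, var_mean, formula)  # 6 6 81/20 81/20
```

Here $\sigma^2 = 54/5$, so the formula gives $\frac{54}{10}\cdot\frac{3}{4} = \frac{81}{20}$, which agrees with the enumeration.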


Exercises

Ex. 1. A box contains N tickets bearing $y_1, y_2, \ldots, y_N$ and n
tickets are drawn at random. If s denotes the sum of the numbers
drawn, prove that

$$E(s) = nm, \qquad \operatorname{Var}(s) = n\sigma^2\,\frac{N-n}{N-1},$$

where m is the mean and $\sigma^2$ the variance of the numbers appearing on all
tickets in the box.

Ex. 2. Show that the sampling variance of the proportion of
males in a random sample of n people drawn from a population
of N is given by

$$\frac{N-n}{N-1} \cdot \frac{p(1-p)}{n}$$

where p is the proportion of males in the population. Obtain
also an unbiased estimate of this sampling variance in terms of
sample estimates.

[M.A. (Bom.) '52, B.U. '57, I.A.S. '48]
Ex. 3. From a finite population of size N and variance

$$\frac{1}{N-1}\sum_{i=1}^{N} (y_i - \bar{y})^2$$

a random sample of n is drawn without replacement. Show that
the arithmetic mean and variance given by

$$\frac{1}{n}\sum_{i=1}^{n} x_i \quad \text{and} \quad \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

are unbiased estimates of the corresponding population values.

[The statistic $t = t(x_1, x_2, \ldots, x_n)$ is called an unbiased estimate
(or estimator) of the parameter $\theta$, if $E(t) = \theta$.]

12.11. Standard error of a partition value.

Suppose we are given a continuous probability distribution by

$$P(x < X < x + dx) = f(x)\,dx, \qquad -\infty < x < \infty.$$

A large random sample of n members is drawn from the above
population. We consider a partition value $x_p$ of the population such that

$$p = \int_{x_p}^{\infty} f(x)\,dx, \qquad q = \int_{-\infty}^{x_p} f(x)\,dx, \qquad [p + q = 1].$$

Let the corresponding partition value for a large sample of n
items be $x_p + \delta x_p$. In the sample the relative frequency above
$x_p + \delta x_p$ is $p$ and above $x_p$ it is $p + \delta p$. Thus $\delta p$ = relative frequency
in the sample for the interval $\delta x_p$. For a large sample $\delta x_p$ and $\delta p$
are small and $\delta p$ is approximately the relative frequency for the interval
$\delta x_p$ in the population.

Therefore $\delta p = f(x_p)\,\delta x_p = y\,\delta x_p$, where y is the ordinate at $x_p$
for the frequency curve of the population.

$$\therefore\ (\delta p)^2 = y^2 (\delta x_p)^2.$$

y is independent of the sample and

$$E(\delta x_p) = 0, \qquad E(\delta p) = 0.$$

Hence

$$E(\delta x_p)^2 = \frac{1}{y^2} E(\delta p)^2 = \frac{1}{y^2} E[p - E(p)]^2$$

or

$$\epsilon^2 = E[x_p - E(x_p)]^2 = \frac{1}{y^2} V(p) = \frac{pq}{n y^2}.$$

This gives the required S.E. of the partition value $x_p$, namely $\epsilon = \dfrac{1}{y}\sqrt{\dfrac{pq}{n}}$.

If y is not given it may be estimated from the frequency
distribution of the large sample.

Suppose the given continuous distribution is the normal
distribution. Then

$$dF = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/(2\sigma^2)}\, dx, \qquad -\infty < x < \infty.$$

Then the S.E. of the median for a large sample of n members
from a normal population with S.D. $\sigma$ is

$$\epsilon = \frac{1}{f(x_p)}\sqrt{\frac{pq}{n}} = \frac{1}{f(x_p)}\sqrt{\frac{1}{4n}},$$

where $\dfrac{x_p}{\sigma} = 0$, $p = \tfrac{1}{2}$, $q = \tfrac{1}{2}$ for the median.

The table gives the values of $\dfrac{1}{\sqrt{2\pi}}\, e^{-x^2/(2\sigma^2)}$, which is equal to $\sigma f(x)$.

For $\dfrac{x_p}{\sigma} = 0$, $\sigma f(x_p) = \dfrac{1}{\sqrt{2\pi}} = 0.399$.

Thus

$$\epsilon = \frac{\sigma}{0.399}\sqrt{\frac{1}{4n}} = 1.25\,\frac{\sigma}{\sqrt{n}},$$

which is 25% greater than the S.E. of the mean.

For the upper quartile $p = \tfrac{1}{4}$ and the area from the mean to
the quartile is 0.25, giving

$$\frac{x}{\sigma} = 0.6745, \qquad \sigma f(x) = 0.3178$$

from normal probability tables.

Thus the S.E. of the quartile is

$$\epsilon = \frac{\sigma}{0.3178}\sqrt{\frac{3}{16n}} \simeq 1.36\,\frac{\sigma}{\sqrt{n}}.$$
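The two numerical cases above (median and upper quartile of a normal population) can be reproduced without tables. The sketch below uses `NormalDist` from Python's standard `statistics` module for the ordinate; the function name `se_partition_value` is hypothetical, and `p` follows the convention of this section, i.e. the proportion of the population lying above the partition value.

```python
from math import sqrt
from statistics import NormalDist


def se_partition_value(p, n, sigma=1.0):
    """S.E. of the partition value x_p of a normal population N(0, sigma^2).

    p is the proportion of the population ABOVE x_p, as in the text,
    so the median has p = 1/2 and the upper quartile p = 1/4.
    """
    dist = NormalDist(0.0, sigma)
    x_p = dist.inv_cdf(1.0 - p)   # partition value
    y = dist.pdf(x_p)             # ordinate f(x_p)
    return sqrt(p * (1.0 - p) / n) / y


n = 100
se_median = se_partition_value(0.5, n)
se_quartile = se_partition_value(0.25, n)
# Express both as multiples of sigma / sqrt(n)
print(round(se_median * sqrt(n), 3), round(se_quartile * sqrt(n), 3))
```

The two multipliers come out close to 1.25 and 1.36, matching the values obtained from the tables.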




Standard Errors of Statistics. 

12.12. Standard error of class frequencies.

Suppose a population is divided into k classes and the mid-value
of the ith class is $x_i$. Let the relative frequency of the ith class be $p_i$,
and denote the sum of the relative frequencies of all the classes other
than the ith class by $q_i$, so that $p_i + q_i = 1$.

A simple sample of n members is drawn from this population.
The relative frequencies of the classes vary with the sample,
fluctuating about the corresponding values for the population.

Suppose $f_i$ is the frequency of the ith class, that is, of the value
$x_i$, in a sample of n members; then

$E(f_i) = np_i$ = expected value of the number of successes in n
independent trials with constant probability $p_i$ of success. Since
$f_i$ is the number of successes in n trials of constant probability $p_i$,

$$\operatorname{Var}(f_i) = np_i q_i.$$

Thus

$$E(\delta f_i)^2 = E[f_i - np_i]^2 = \operatorname{Var}(f_i) = np_i q_i = np_i(1 - p_i),$$

where the symbol $\delta$ is used for the deviation of the variable from
its mean.

If $p_i$ is unknown then the estimate of it from the sample is $\dfrac{f_i}{n}$,
i.e. $p_i \simeq \dfrac{f_i}{n}$.

Thus

$$E(\delta f_i)^2 \simeq f_i\left(1 - \frac{f_i}{n}\right).$$

It is a fair approximation, if n is large.

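An unbiased estimate of the sampling variance of $f_i$ is provided by

$$\frac{n}{n-1}\, f_i\left(1 - \frac{f_i}{n}\right).$$

We have

$$E(f_i^2) = E[f_i - E(f_i)]^2 + [E(f_i)]^2 = np_i q_i + (np_i)^2.$$

Hence

$$E\left[f_i - \frac{f_i^2}{n}\right] = E(f_i) - \frac{1}{n}E(f_i^2) = np_i - p_i q_i - np_i^2$$

$$= np_i q_i - p_i q_i = (n-1)\,p_i q_i = \frac{n-1}{n}\, np_i q_i = \frac{n-1}{n} \operatorname{Var}(f_i).$$

If n is large, $\dfrac{n-1}{n} \simeq 1$.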

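Covariance of the frequencies in different classes.

$$\sum_{i=1}^{k} \delta f_i = \sum_{i=1}^{k} [f_i - E(f_i)] = 0$$

for samples of given size n, since the sum of the class frequencies
in any sample is constant.

$$\operatorname{Cov}(f_i, f_j) = E[(f_i - np_i)(f_j - np_j)] = E(\delta f_i\, \delta f_j).$$

In a sample in which $\delta f_i$ has a fixed value, $-\delta f_i$ is distributed
among the other class frequencies, on the average, in proportion
to their expected values. Thus out of $n(1 - p_i)$ members
$np_j$ belong to the jth class and so the proportion of $-\delta f_i$ to be
attributed to the jth class is

$$\frac{np_j}{n(1 - p_i)} = \frac{p_j}{1 - p_i},$$

that is to say, the average value of $\delta f_j$ is

$$\overline{\delta f_j} = -\frac{p_j}{1 - p_i}\, \delta f_i.$$

In calculating the covariance we may replace each value of
$\delta f_j$ in the class by its mean value $\overline{\delta f_j}$. Then

$$E(\delta f_i\, \delta f_j) = -\frac{p_j}{1 - p_i}\, E(\delta f_i)^2 = -\frac{p_j}{1 - p_i}\, np_i(1 - p_i) = -np_i p_j.$$

For large n, $p_i \simeq \dfrac{f_i}{n}$, $p_j \simeq \dfrac{f_j}{n}$ and so

$$E(\delta f_i\, \delta f_j) \simeq -\frac{f_i f_j}{n}.$$

Further,

$$E(\delta f_i\, \delta f_j) = E[f_i - E(f_i)][f_j - E(f_j)] = E(f_i f_j) - E(f_i)E(f_j)$$

$$\Rightarrow E(f_i f_j) = E(\delta f_i\, \delta f_j) + E(f_i)E(f_j) = -np_i p_j + n^2 p_i p_j$$

$$= n(n-1)\,p_i p_j = -(n-1)\,E(\delta f_i\, \delta f_j).$$

$\therefore\ -\dfrac{f_i f_j}{n-1}$ is the unbiased estimate of the covariance.

When n is large, $n - 1 \simeq n$.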
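The two results $\operatorname{Var}(f_i) = np_iq_i$ and $\operatorname{Cov}(f_i, f_j) = -np_ip_j$ can be confirmed by brute force for a small case. The sketch below is only an illustration under assumed values (three classes with probabilities 1/2, 1/3, 1/6 and n = 4); it enumerates every ordered sample and accumulates exact moments.

```python
from fractions import Fraction
from itertools import product


def class_frequency_moments(probs, n):
    """Exact means and covariance matrix of the class frequencies
    f_1, ..., f_k in a simple sample of n members."""
    k = len(probs)
    mean = [Fraction(0)] * k
    second = [[Fraction(0)] * k for _ in range(k)]
    # each ordered sample assigns one class to each of the n members
    for outcome in product(range(k), repeat=n):
        weight = Fraction(1)
        for c in outcome:
            weight *= probs[c]
        f = [outcome.count(i) for i in range(k)]
        for i in range(k):
            mean[i] += weight * f[i]
            for j in range(k):
                second[i][j] += weight * f[i] * f[j]
    cov = [[second[i][j] - mean[i] * mean[j] for j in range(k)]
           for i in range(k)]
    return mean, cov


probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # assumed values
n = 4
mean, cov = class_frequency_moments(probs, n)
print(mean[0], cov[0][0], cov[0][1])  # 2 1 -2/3
```

Here $np_1 = 2$, $np_1q_1 = 1$ and $-np_1p_2 = -2/3$, in agreement with the formulas.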





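12.13. Standard errors of sample moments about a fixed
value.

$\mu_r$ denotes the rth moment of the population about its mean
and $\mu'_r$ denotes the rth moment of the population about some
other specified value. The corresponding moments for the sample
are denoted by $m_r$ and $m'_r$.

We have for the rth moment of the sample about $x = 0$

$$n\,m'_r = \sum_{i=1}^{k} f_i x_i^r.$$

Since $\delta f_i = f_i - np_i$, the value $f_i$ fluctuates about $np_i$. We
denote the deviation of the rth moment of the sample about $x = 0$ from its mean
value by

$$\delta m'_r = m'_r - E(m'_r);$$

hence

$$n\,\delta m'_r = \sum_i x_i^r\, \delta f_i$$

$$\Rightarrow n^2 (\delta m'_r)^2 = \sum_i x_i^{2r} (\delta f_i)^2 + \sum_{i \neq j} x_i^r x_j^r\, \delta f_i\, \delta f_j$$

since the values $x_i$ are the same for each sample.

Taking expected values

$$n^2 E(\delta m'_r)^2 = \sum_i x_i^{2r} E(\delta f_i)^2 + \sum_{i \neq j} x_i^r x_j^r E(\delta f_i\, \delta f_j) = \sum_i x_i^{2r}\, np_i q_i - \sum_{i \neq j} x_i^r x_j^r\, np_i p_j$$

$$\Rightarrow n E(\delta m'_r)^2 = \sum_i p_i x_i^{2r} - \sum_i p_i^2 x_i^{2r} - \sum_{i \neq j} p_i p_j x_i^r x_j^r = \sum_i p_i x_i^{2r} - \left(\sum_i p_i x_i^r\right)^2$$

$$\Rightarrow E(\delta m'_r)^2 = \frac{1}{n}\left(\mu'_{2r} - \mu'^{\,2}_r\right).$$

If we substitute for

$$E(\delta f_i)^2 \simeq \frac{n}{n-1}\, f_i\left(1 - \frac{f_i}{n}\right) \quad \text{and for} \quad E(\delta f_i\, \delta f_j) \simeq -\frac{f_i f_j}{n-1},$$

the unbiased estimate of the sampling variance of $m'_r$ in terms of
the moments of the sample is given by

$$E(\delta m'_r)^2 = \frac{1}{n-1}\left(m'_{2r} - m'^{\,2}_r\right).$$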
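Since $m'_1 = \dfrac{1}{n}\sum f_i x_i = \bar{x}$, we have

$$\operatorname{Var}(\bar{x}) = E(\delta\bar{x})^2 = \frac{1}{n}\left(\mu'_2 - \mu'^{\,2}_1\right) = \frac{\sigma^2}{n}$$

where $\sigma^2$ is the variance of the population.

In terms of sample moments

$$\operatorname{Var}(\bar{x}) = E(\delta\bar{x})^2 = E[\bar{x} - E(\bar{x})]^2 = \frac{1}{n-1}\left(m'_2 - m'^{\,2}_1\right)$$

$$= \frac{1}{n-1}\left[\frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2\right] = \frac{s^2}{n}, \quad \text{where } E(s^2) = \sigma^2,$$

that is, $s^2$ is the unbiased estimate of the population variance.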

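Now we find the covariance $\operatorname{Cov}(m'_r, m'_s)$:

$$\operatorname{Cov}(m'_r, m'_s) = E(\delta m'_r\, \delta m'_s).$$

We have

$$n\,m'_r = \sum f_i x_i^r \Rightarrow n\,\delta m'_r = \sum x_i^r\, \delta f_i,$$
$$n\,m'_s = \sum f_i x_i^s \Rightarrow n\,\delta m'_s = \sum x_i^s\, \delta f_i.$$

Multiplication yields

$$n^2\, \delta m'_r\, \delta m'_s = \sum_i x_i^{r+s} (\delta f_i)^2 + \sum_{i \neq j} x_i^r x_j^s\, \delta f_i\, \delta f_j$$

$$\Rightarrow n^2 E(\delta m'_r\, \delta m'_s) = \sum_i x_i^{r+s} E(\delta f_i)^2 + \sum_{i \neq j} x_i^r x_j^s E(\delta f_i\, \delta f_j)$$

$$= \sum_i x_i^{r+s}\, np_i(1 - p_i) - \sum_{i \neq j} x_i^r x_j^s\, np_i p_j$$

$$\Rightarrow n E(\delta m'_r\, \delta m'_s) = \sum_i p_i x_i^{r+s} - \sum_i p_i^2 x_i^{r+s} - \sum_{i \neq j} p_i p_j x_i^r x_j^s$$

$$= \mu'_{r+s} - \left(\sum_i p_i x_i^r\right)\left(\sum_j p_j x_j^s\right) = \mu'_{r+s} - \mu'_r \mu'_s.$$

$$\therefore\ \operatorname{Cov}(m'_r, m'_s) = \frac{1}{n}\left(\mu'_{r+s} - \mu'_r \mu'_s\right).$$

The unbiased estimate of $\operatorname{Cov}(m'_r, m'_s)$ in terms of sample
moments is obtained as above, after substituting for

$$E(\delta f_i)^2 = \frac{n}{n-1}\, f_i\left(1 - \frac{f_i}{n}\right) \quad \text{and} \quad E(\delta f_i\, \delta f_j) = -\frac{f_i f_j}{n-1}.$$

It is

$$\operatorname{Cov}(m'_r, m'_s) = \frac{1}{n-1}\left(m'_{r+s} - m'_r m'_s\right).$$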

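Note. If the population frequency distribution is ungrouped,
we proceed as below.

$$m'_r = \frac{1}{n}\sum_i x_i^r,$$

$$E(m'_r) = E\left(\frac{1}{n}\sum_i x_i^r\right) = \frac{1}{n}\sum_i E(x_i^r) = \mu'_r,$$

$$\operatorname{Var}(m'_r) = E(m'_r - \mu'_r)^2 = E\left[\frac{1}{n}\sum_i x_i^r - \mu'_r\right]^2$$

$$= \frac{1}{n^2} E\left[\sum_i x_i^{2r} + 2\sum_{i < j} x_i^r x_j^r - 2n\mu'_r \sum_i x_i^r + n^2 \mu'^{\,2}_r\right]$$

$$= \frac{1}{n^2}\left[\sum_i E(x_i^{2r}) + 2\sum_{i < j} E(x_i^r)E(x_j^r) - 2n^2\mu'^{\,2}_r + n^2\mu'^{\,2}_r\right]$$

$$= \frac{1}{n^2}\left[n\mu'_{2r} + n(n-1)\mu'^{\,2}_r - n^2\mu'^{\,2}_r\right] = \frac{1}{n^2}\left[n\mu'_{2r} - n\mu'^{\,2}_r\right] = \frac{1}{n}\left[\mu'_{2r} - \mu'^{\,2}_r\right],$$

since the $x_i$'s are independent.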


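12.14. Standard errors of the variance and standard deviation of a
large sample.

Note that the mean of a sample changes with the sample.
For a sample we have

$$m_2 = m'_2 - \bar{x}^2$$

$$\Rightarrow \delta m_2 = \delta m'_2 - 2\bar{x}\,\delta\bar{x}$$

$$\Rightarrow (\delta m_2)^2 = (\delta m'_2)^2 - 4\bar{x}\,\delta\bar{x}\,\delta m'_2 + 4\bar{x}^2 (\delta\bar{x})^2 \qquad \ldots(1)$$

where $\delta\bar{x} = \dfrac{1}{n}\sum x_i\, \delta f_i$, $\delta m'_2 = \dfrac{1}{n}\sum x_i^2\, \delta f_i$.

Suppose the mean of the parent population is zero and hence
$E(\bar{x}) = \mu'_1 = 0$, and $\delta\bar{x} = \bar{x} - E(\bar{x}) = \bar{x}$.
On this supposition

$$E(\delta m'_2)^2 = \frac{1}{n}\left(\mu_4 - \mu_2^2\right).$$

Also $\bar{x}^2 (\delta\bar{x})^2 = (\delta\bar{x})^2 (\delta\bar{x})^2 = (\delta\bar{x})^4$.

Since $E(\delta\bar{x})^2 = \dfrac{\mu_2}{n}$, this implies $E(\delta\bar{x})^4$ is of the order $\dfrac{1}{n^2}$;

$E(\bar{x}\,\delta\bar{x}\,\delta m'_2) = E\left((\delta\bar{x})^2\, \delta m'_2\right)$ is of the order $\dfrac{1}{n^2}$.

If n is large, then $E(\delta m'_2)^2$ is large in comparison with the
second and third terms of (1).

Hence

$$E(\delta m_2)^2 \simeq \frac{1}{n}\left(\mu_4 - \mu_2^2\right) \qquad \ldots(2)$$

For a normal population with S.D. $\sigma$,

$$\mu_4 = 3\sigma^4, \qquad \mu_2 = \sigma^2.$$

Hence for large samples from a normal population

$$E(\delta m_2)^2 \simeq \frac{2\sigma^4}{n}.$$

The S.E. of the S.D. of large samples is worked out below.

Let $m_2 = \dfrac{1}{n}\sum f_i (x_i - \bar{x})^2 = S^2$

$$\Rightarrow \delta m_2 = 2S\,\delta S.$$

For a large sample $S \simeq \sigma$, $\delta S$ being small [$E(S^2) = \dfrac{n-1}{n}\sigma^2 \simeq \sigma^2$, if n is
large].

Thus

$$E(\delta m_2)^2 = 4\sigma^2 E(\delta S)^2$$

$$\Rightarrow E(\delta S)^2 = \frac{\mu_4 - \mu_2^2}{4n\sigma^2} = \frac{\mu_4 - \mu_2^2}{4n\mu_2}$$

(Note: $\sigma^2 = \mu_2$ for the population.)

For large samples from a normal population

$$E(\delta S)^2 = \frac{3\sigma^4 - \sigma^4}{4n\sigma^2} = \frac{\sigma^2}{2n}$$

$\Rightarrow$ S.E. of the S.D. of a large sample $= \dfrac{\sigma}{\sqrt{2n}}$.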

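Note 1. The standard error of a moment about the mean of a large
sample is calculated as below.

We have

$$n\,m_r = \sum (x_i - \bar{x})^r f_i = \sum x_i^r f_i - r\bar{x}\sum x_i^{r-1} f_i + \ldots$$

$$\Rightarrow \delta m_r = \delta m'_r - r\,\delta\bar{x}\; m'_{r-1} - r\bar{x}\; \delta m'_{r-1} - \ldots$$

If the origin is taken at the mean of the population, $\bar{x}$ becomes
identical with $\delta\bar{x}$. As we need retain only the terms of the lowest
order, we neglect those after the first two. On squaring we may write
the result

$$(\delta m_r)^2 = (\delta m'_r)^2 + r^2 m'^{\,2}_{r-1} (\delta\bar{x})^2 - 2r\, m'_{r-1}\, \delta\bar{x}\, \delta m'_r$$

$$\Rightarrow E(\delta m_r)^2 = E(\delta m'_r)^2 + r^2 \mu^2_{r-1} E(\delta\bar{x})^2 - 2r\, \mu_{r-1} E(\delta\bar{x}\, \delta m'_r).$$

Taking the origin as the mean of the population this yields

$$E(\delta m_r)^2 = \frac{1}{n}\left(\mu_{2r} - \mu_r^2 + r^2 \mu^2_{r-1}\, \mu_2 - 2r\, \mu_{r+1}\, \mu_{r-1}\right).$$

[We have used the results $E(\delta m'_r)^2 = \dfrac{1}{n}\left(\mu'_{2r} - \mu'^{\,2}_r\right)$
and $E(\delta m'_r\, \delta m'_s) = \dfrac{1}{n}\left(\mu'_{r+s} - \mu'_r \mu'_s\right)$.]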
Note 2. If we compare the S.D.'s $S_1$ and $S_2$ of two large
samples of sizes $n_1$ and $n_2$ from the same normal population with
variance $\sigma^2$, then

$$E(S_1 - S_2) = 0$$

and

$$\epsilon^2 = V(S_1 - S_2) = \tfrac{1}{2}\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right).$$

Therefore if $\dfrac{|S_1 - S_2|}{\epsilon} < 1.96$, the difference $S_1 - S_2$ is not regarded
significant at the 5% level of probability. Thus, by comparing
the actual difference with the S.E. $\epsilon$, we test as usual the credibility
of the hypothesis that the samples were drawn from the same
population.

Ex. 1. The S.D. of a simple sample of 2,000 members is
5.9 years and that of an independent sample of 2,500 members
is 6.1 years. May the samples be reasonably regarded as from the
same normal population?

S n EX ’^ In 3 samp,e of 10 ' 000 from a normal population the 

cm ,S , 2 Cm ’ Sh ° W lhat ,he S E * of lhis quantity is 0 018 
cm nearly, and hence that the S. D. of the population almost 
certainly lies between 2 s 6 and 2*58 cm. 

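Note 3. For the ungrouped frequency distribution we calculate
Var $(m_2)$ as follows:

We have

$$E(m_2) = E\left[\frac{1}{n}\sum_i (x_i - \bar{x})^2\right] = \frac{1}{n} E\left[\sum_i x_i^2 - 2m'_1 \sum_i x_i + n m'^{\,2}_1\right]$$

$$= \frac{1}{n} E\left[\sum_i x_i^2 - \frac{1}{n}\left(\sum_i x_i\right)^2\right] = \frac{1}{n} E\left[\sum_i x_i^2 - \frac{1}{n}\left(\sum_i x_i^2 + 2\sum_{i < j} x_i x_j\right)\right]$$

$$= \frac{1}{n}\left[\sum_i E(x_i^2) - \frac{1}{n}\sum_i E(x_i^2) - \frac{2}{n}\sum_{i < j} E(x_i)E(x_j)\right]$$

[since the x's are independent, $E(x_i x_j) = E(x_i)E(x_j)$]

$$= \frac{1}{n}\left[n\mu'_2 - \mu'_2 - \frac{2}{n} \cdot \frac{n(n-1)}{1 \cdot 2}\, \mu'^{\,2}_1\right] = \frac{n-1}{n}\left(\mu'_2 - \mu'^{\,2}_1\right) = \frac{n-1}{n}\, \mu_2.$$

$$\operatorname{Var}(m_2) = E\left[m_2 - \frac{n-1}{n}\mu_2\right]^2 = E\left[m_2^2 - 2m_2 \frac{n-1}{n}\mu_2 + \left(\frac{n-1}{n}\right)^2 \mu_2^2\right]$$

$$= E(m_2^2) - 2\left(\frac{n-1}{n}\right)^2 \mu_2^2 + \left(\frac{n-1}{n}\right)^2 \mu_2^2 = E(m_2^2) - \left(\frac{n-1}{n}\right)^2 \mu_2^2.$$

Now we shall evaluate $E(m_2^2)$.

$$m_2^2 = \left[\frac{1}{n}\sum_i x_i^2 - \frac{1}{n^2}\left(\sum_i x_i\right)^2\right]^2 = \frac{1}{n^2}\left(\sum_i x_i^2\right)^2 - \frac{2}{n^3}\left(\sum_i x_i^2\right)\left(\sum_i x_i\right)^2 + \frac{1}{n^4}\left(\sum_i x_i\right)^4.$$

Expanding each product, taking expected values term by term (the x's being
independent) and taking $E(x_i) = \mu'_1 = 0$, we get

$$E\left(\sum_i x_i^2\right)^2 = n\mu_4 + n(n-1)\mu_2^2, \qquad E\left[\left(\sum_i x_i^2\right)\left(\sum_i x_i\right)^2\right] = n\mu_4 + n(n-1)\mu_2^2,$$

$$E\left(\sum_i x_i\right)^4 = n\mu_4 + 6\,\frac{n(n-1)}{1 \cdot 2}\,\mu_2^2 = n\mu_4 + 3n(n-1)\mu_2^2.$$

Hence

$$E(m_2^2) = \frac{1}{n^2}\left[n\mu_4 + n(n-1)\mu_2^2\right] - \frac{2}{n^3}\left[n\mu_4 + n(n-1)\mu_2^2\right] + \frac{1}{n^4}\left[n\mu_4 + 3n(n-1)\mu_2^2\right]$$

$$= \frac{(n-1)^2}{n^3}\,\mu_4 + \frac{(n-1)(n^2 - 2n + 3)}{n^3}\,\mu_2^2.$$

Thus

$$\operatorname{Var}(m_2) = \frac{(n-1)^2}{n^3}\,\mu_4 + \frac{(n-1)(n^2 - 2n + 3)}{n^3}\,\mu_2^2 - \frac{(n-1)^2}{n^2}\,\mu_2^2$$

$$= \frac{(n-1)^2}{n^3}\,\mu_4 - \frac{(n-1)(n-3)}{n^3}\,\mu_2^2 = \frac{(n-1)^2}{n^3}\left(\mu_4 - \mu_2^2\right) + \frac{2(n-1)}{n^3}\,\mu_2^2.$$

$\operatorname{Var}(m_2) = \dfrac{1}{n}\left(\mu_4 - \mu_2^2\right)$ to order $\dfrac{1}{n}$, and for the normal
parent population

$$\operatorname{Var}(m_2) = \frac{3\sigma^4 - \sigma^4}{n} = \frac{2\sigma^4}{n}.$$

Cor. $s^2 = \dfrac{1}{n-1}\sum_i (x_i - \bar{x})^2 = \dfrac{n}{n-1}\, m_2$.

Hence

$$\operatorname{Var}(s^2) = \operatorname{Var}\left(\frac{n}{n-1}\, m_2\right) = \left(\frac{n}{n-1}\right)^2 \operatorname{Var}(m_2) = \frac{2\sigma^4}{n-1}$$

for the normal parent population.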
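The exact expression $\operatorname{Var}(m_2) = \dfrac{(n-1)^2}{n^3}\mu_4 - \dfrac{(n-1)(n-3)}{n^3}\mu_2^2$ can be checked by complete enumeration for a small discrete population. The sketch below assumes independent draws from three equally likely values (a made-up population); the function name is hypothetical and exact fractions avoid any rounding.

```python
from fractions import Fraction
from itertools import product


def exact_var_m2(values, n):
    """Enumerate all ordered samples of n independent draws from
    equally likely `values` and compare Var(m_2) with the formula."""
    N = len(values)
    mu = Fraction(sum(values), N)
    mu2 = sum((Fraction(v) - mu) ** 2 for v in values) / N
    mu4 = sum((Fraction(v) - mu) ** 4 for v in values) / N
    m2_values = []
    for sample in product(values, repeat=n):
        xbar = Fraction(sum(sample), n)
        m2_values.append(sum((Fraction(x) - xbar) ** 2 for x in sample) / n)
    e_m2 = sum(m2_values) / len(m2_values)
    var_m2 = sum((m - e_m2) ** 2 for m in m2_values) / len(m2_values)
    formula = (Fraction((n - 1) ** 2, n ** 3) * mu4
               - Fraction((n - 1) * (n - 3), n ** 3) * mu2 ** 2)
    return e_m2, var_m2, formula, mu2


# Hypothetical population {0, 1, 3}, samples of n = 4
e_m2, var_m2, formula, mu2 = exact_var_m2([0, 1, 3], 4)
print(e_m2 == Fraction(3, 4) * mu2, var_m2 == formula)  # True True
```

The enumeration also confirms $E(m_2) = \dfrac{n-1}{n}\mu_2$, the first step of the note.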

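Theorem. If $y_1, y_2, \ldots, y_p$ are p statistics calculated from a
sample of n members and if each $y_i$, $i = 1, 2, \ldots, p$, has mean $\xi_i$ and
its variance is of the order $\dfrac{1}{n}$, then

$$\operatorname{Var}(f) = \sum_i \left(\frac{\partial f}{\partial \xi_i}\right)^2 \operatorname{Var}(y_i) + \sum_{i \neq j} \frac{\partial f}{\partial \xi_i}\frac{\partial f}{\partial \xi_j} \operatorname{Cov}(y_i, y_j),$$

where f is a function of the y's and possesses partial derivatives.

By Taylor's theorem,

$$f(y_1, \ldots, y_p) = f(\xi_1, \ldots, \xi_p) + \sum_i (y_i - \xi_i)\frac{\partial f}{\partial \xi_i} + \frac{1}{2!}\left[\sum_i (y_i - \xi_i)^2 \frac{\partial^2 f}{\partial \xi_i^2} + \sum_{i \neq j} (y_i - \xi_i)(y_j - \xi_j)\frac{\partial^2 f}{\partial \xi_i\, \partial \xi_j}\right] + \ldots$$

Var $(y_i)$ is of order $\dfrac{1}{n} \Rightarrow (y_i - \xi_i)^2$ is of order $\dfrac{1}{n}$.

Hence, neglecting terms of higher order of smallness, we have

$$f(y_1, \ldots, y_p) = f(\xi_1, \ldots, \xi_p) + \sum_i (y_i - \xi_i)\frac{\partial f}{\partial \xi_i}$$

$$\Rightarrow f(y_1, \ldots, y_p) - f(\xi_1, \ldots, \xi_p) = \sum_i (y_i - \xi_i)\frac{\partial f}{\partial \xi_i}$$

$$\Rightarrow E\left[f(y_1, \ldots, y_p) - f(\xi_1, \ldots, \xi_p)\right]^2 = E\left[\sum_i (y_i - \xi_i)\frac{\partial f}{\partial \xi_i}\right]^2$$

$$\Rightarrow \operatorname{Var}(f) = E\left[\sum_i (y_i - \xi_i)^2 \left(\frac{\partial f}{\partial \xi_i}\right)^2 + \sum_{i \neq j} (y_i - \xi_i)(y_j - \xi_j)\frac{\partial f}{\partial \xi_i}\frac{\partial f}{\partial \xi_j}\right]$$

$$= \sum_i \left(\frac{\partial f}{\partial \xi_i}\right)^2 \operatorname{Var}(y_i) + \sum_{i \neq j} \frac{\partial f}{\partial \xi_i}\frac{\partial f}{\partial \xi_j} \operatorname{Cov}(y_i, y_j).$$

Let $f_1 = f_1(y_1, \ldots, y_p)$, $f_2 = f_2(y_1, \ldots, y_p)$ be two functions of the y's;
then

$$\operatorname{Cov}(f_1, f_2) = E\left[\{f_1(y_1, \ldots, y_p) - f_1(\xi_1, \ldots, \xi_p)\}\{f_2(y_1, \ldots, y_p) - f_2(\xi_1, \ldots, \xi_p)\}\right]$$

$$= E\left[\sum_i (y_i - \xi_i)\frac{\partial f_1}{\partial \xi_i}\right]\left[\sum_j (y_j - \xi_j)\frac{\partial f_2}{\partial \xi_j}\right]$$

$$= \sum_i \frac{\partial f_1}{\partial \xi_i}\frac{\partial f_2}{\partial \xi_i} \operatorname{Var}(y_i) + \sum_{i \neq j} \frac{\partial f_1}{\partial \xi_i}\frac{\partial f_2}{\partial \xi_j} \operatorname{Cov}(y_i, y_j).$$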
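The first-order formula for Var $(f)$ is a double sum over the gradient and the covariance matrix of the statistics, which is short to write out. The sketch below is an illustration only: the function name is hypothetical, and the example applies it to $S = \sqrt{m_2}$ for a normal parent with assumed values $\sigma = 2$ and $n = 400$, taking $\operatorname{Var}(m_2) = 2\sigma^4/n$ from the large-sample result.

```python
def delta_method_variance(gradient, covariance):
    """First-order variance of f(y_1, ..., y_p).

    gradient[i]      : partial derivative of f with respect to xi_i
    covariance[i][j] : Cov(y_i, y_j), with Var(y_i) on the diagonal
    """
    p = len(gradient)
    return sum(gradient[i] * gradient[j] * covariance[i][j]
               for i in range(p) for j in range(p))


# Example: f(m_2) = sqrt(m_2), so df/dm_2 = 1 / (2 sigma) at m_2 = sigma^2
sigma, n = 2.0, 400          # assumed values for illustration
var_m2 = 2 * sigma ** 4 / n  # large-sample variance of m_2, normal parent
var_S = delta_method_variance([1 / (2 * sigma)], [[var_m2]])
print(var_S)  # 0.005
```

The value 0.005 equals $\sigma^2/(2n)$, the square of the S.E. of the S.D. found earlier.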

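Application of the above theorem for calculating standard
errors of sample moments about the mean.

$$m_r = \frac{1}{n}\sum_i (x_i - m'_1)^r = \frac{1}{n}\sum_i \left[x_i^r - \binom{r}{1} x_i^{r-1} m'_1 + \binom{r}{2} x_i^{r-2} m'^{\,2}_1 - \ldots + (-1)^r m'^{\,r}_1\right]$$

$$= m'_r - \binom{r}{1} m'_{r-1} m'_1 + \ldots + (-1)^r m'^{\,r}_1 = f(m'_r, m'_{r-1}, \ldots, m'_1)$$

$$= f(\mu'_r, \mu'_{r-1}, \ldots, \mu'_1) + \sum_{i=1}^{r} (m'_i - \mu'_i)\frac{\partial f}{\partial \mu'_i}.$$

Now

$$f(\mu'_r, \mu'_{r-1}, \ldots, \mu'_1) = \mu'_r - \binom{r}{1}\mu'_{r-1}\mu'_1 + \ldots + (-1)^r \mu'^{\,r}_1,$$

$$\frac{\partial f}{\partial \mu'_r} = 1, \qquad \frac{\partial f}{\partial \mu'_1} = -r\mu_{r-1} \quad \text{(taking } \mu'_1 = 0\text{)}.$$

Therefore, since $m'_r$ is distributed about the mean with variance
of order $\dfrac{1}{n}$,

$$m_r = f(\mu'_r, \mu'_{r-1}, \ldots, \mu'_1) + (m'_1 - \mu'_1)\frac{\partial f}{\partial \mu'_1} + (m'_r - \mu'_r)\frac{\partial f}{\partial \mu'_r}$$

$$= f(\mu'_r, \mu'_{r-1}, \ldots, \mu'_1) + (m'_r - \mu_r) - (m'_1 - \mu'_1)\, r\mu_{r-1}$$

$$\Rightarrow \operatorname{Var}(m_r) = E\left[(m'_r - \mu_r) - r(m'_1 - \mu'_1)\mu_{r-1}\right]^2$$

$$= E\left[(m'_r - \mu_r)^2 + r^2 \mu^2_{r-1}(m'_1 - \mu'_1)^2 - 2r\mu_{r-1}(m'_1 - \mu'_1)(m'_r - \mu_r)\right]$$

$$= \operatorname{Var}(m'_r) + r^2\mu^2_{r-1}\operatorname{Var}(m'_1) - 2r\mu_{r-1}\operatorname{Cov}(m'_1, m'_r)$$

$$= \frac{1}{n}\left(\mu'_{2r} - \mu'^{\,2}_r\right) + \frac{r^2}{n}\mu'^{\,2}_{r-1}\left(\mu'_2 - \mu'^{\,2}_1\right) - \frac{2r}{n}\mu'_{r-1}\left(\mu'_{r+1} - \mu'_1\mu'_r\right).$$

Since $\mu'_1 = 0$,

$$\Rightarrow \operatorname{Var}(m_r) = \frac{1}{n}\left[\mu_{2r} - \mu_r^2 + r^2\mu_2\,\mu^2_{r-1} - 2r\mu_{r-1}\,\mu_{r+1}\right] \qquad [\mu'_1 = 0].$$

Similarly

$$\operatorname{Cov}(m_r, m_s) = \frac{1}{n}\left[\mu_{r+s} - \mu_r\mu_s - r\mu_{r-1}\mu_{s+1} - s\mu_{s-1}\mu_{r+1} + rs\,\mu_{r-1}\mu_{s-1}\mu_2\right].$$

$$\operatorname{Cov}(m_r, m'_s) = E\left[\{(m'_r - \mu_r) - r(m'_1 - \mu'_1)\mu_{r-1}\}(m'_s - \mu_s)\right]$$

$$= \operatorname{Cov}(m'_r, m'_s) - r\mu_{r-1}\operatorname{Cov}(m'_1, m'_s)$$

$$= \frac{1}{n}\left(\mu_{r+s} - \mu_r\mu_s\right) - r\mu_{r-1}\,\frac{1}{n}\left(\mu_{1+s} - \mu_1\mu_s\right)$$

$$= \frac{1}{n}\left(\mu_{r+s} - \mu_r\mu_s - r\mu_{r-1}\mu_{s+1}\right), \qquad \because\ \mu'_1 = 0.$$

$$\operatorname{Cov}(m'_1, m_r) = \operatorname{Cov}(m_r, m'_1) = \frac{1}{n}\left(\mu_{r+1} - r\mu_2\,\mu_{r-1}\right).$$

Now if r is even, $\mu_{r+1}$ and $\mu_{r-1}$ are moments of odd order and
therefore vanish for a symmetrical distribution.

This implies Cov $(m'_1, m_r) = 0$. Thus if the population is
symmetrical, $\bar{x}$ and $m_r$ (r even) are uncorrelated.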

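Standard error of the coefficient of variation.

Coefficient of variation

$$V = \frac{100\sqrt{m_2}}{m'_1}.$$

$$\log V = \log 100 + \tfrac{1}{2}\log m_2 - \log m'_1$$

$$\Rightarrow \delta(\log V) = \delta\left[\log 100 + \tfrac{1}{2}\log m_2 - \log m'_1\right]$$

$$\Rightarrow \frac{\delta V}{V} = \frac{\delta m_2}{2m_2} - \frac{\delta m'_1}{m'_1}.$$

Squaring both sides and taking expected values

$$\frac{E(\delta V)^2}{V^2} = \frac{1}{4m_2^2}E(\delta m_2)^2 + \frac{1}{m'^{\,2}_1}E(\delta m'_1)^2 - \frac{1}{m_2 m'_1}E(\delta m_2\, \delta m'_1)$$

$$\Rightarrow E(\delta V)^2 = V^2\left[\frac{\mu_4 - \mu_2^2}{4n\,m_2^2} + \frac{\mu_2}{n\,m'^{\,2}_1} - \frac{\mu_3}{n\,m_2 m'_1}\right].$$

Taking $m_r = \mu_r$,

$$\Rightarrow E(\delta V)^2 = V^2\left[\frac{\mu_4 - \mu_2^2}{4n\mu_2^2} + \frac{\mu_2}{n\mu'^{\,2}_1} - \frac{\mu_3}{n\mu_2\mu'_1}\right].$$

For the normal parent, $\mu_4 = 3\sigma^4$, $\mu_2 = \sigma^2$, $\mu_3 = 0$. Thus

$$E(\delta V)^2 = V^2\left[\frac{1}{2n} + \frac{V^2}{10^4\, n}\right] = \frac{V^2}{2n}\left[1 + \frac{2V^2}{10^4}\right] \simeq \frac{V^2}{2n} \text{ approx.}$$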
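For a normal parent the result reduces to a one-line computation. The following sketch simply evaluates $\sqrt{\dfrac{V^2}{2n}\left(1 + \dfrac{2V^2}{10^4}\right)}$; the function name and the numerical values (V = 10 per cent, n = 200) are assumptions chosen for illustration.

```python
from math import sqrt


def se_coefficient_of_variation(V, n):
    """Large-sample S.E. of the coefficient of variation V (in per cent)
    for a sample of size n from a normal population."""
    return sqrt(V ** 2 / (2 * n) * (1 + 2 * V ** 2 / 10 ** 4))


se_full = se_coefficient_of_variation(10.0, 200)
se_rough = 10.0 / sqrt(2 * 200)   # the approximation V / sqrt(2n)
print(round(se_full, 3), round(se_rough, 3))  # 0.505 0.5
```

When V is small the correction factor is close to 1, which is why $V/\sqrt{2n}$ is often adequate.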

Exercises


1. Show that $\dfrac{1}{n-1}\sum (x - \bar{x})^2$ is an unbiased estimate of the
population variance and that the sampling variance of this estimate
is given by

$$\frac{1}{n}\left(\mu_4 - \frac{n-3}{n-1}\,\mu_2^2\right). \qquad \text{[M.A. Madras '50]}$$

2. If $\bar{x} = \dfrac{1}{n}\sum x_i$ and $m_2 = \dfrac{1}{n}\sum (x_i - \bar{x})^2$ are the sample
mean and the sample variance, show that

$$\operatorname{Cov}(\bar{x}, m_2) = \frac{n-1}{n^2}\,\mu_3$$

where $\mu_3$ is the third central moment of the parent population.

Show that if the population is symmetrical, $\bar{x}$ and $m_2$ are
uncorrelated. [M.A. Eco. Statistics Delhi '59]

3. Show that

$$E(m'_r) = \mu'_r, \qquad \operatorname{Var}(m'_r) = \frac{1}{n}\left(\mu'_{2r} - \mu'^{\,2}_r\right)$$

where $m'_r$ is the rth sample moment about the origin for a random
sample $x_1, x_2, \ldots, x_n$ from a population. [M.A. Madras '50]

4. Show that $\dfrac{1}{n-1}\sum (x - \bar{x})^2$ is an unbiased estimate of the
population variance and that the sampling variance of this estimate
is given by

$$\frac{1}{n}\left(\mu_4 - \frac{n-3}{n-1}\,\mu_2^2\right). \qquad \text{[M.A. Madras '50]}$$





5. Find the sampling variance of the coefficient of variation
of a sample of size n from a normal population.

How will you use these variances to test the significance of
$m_2$ and the coefficient of variation?

[M.A. Statistics, Delhi '60]

6. $M_r$ is the rth central moment in a sample of size n. If
n is large, show that the covariance between $M_r$ and $M_s$ is

$$\frac{1}{n}\left[\mu_{r+s} - \mu_r\mu_s + rs\,\mu_2\,\mu_{r-1}\mu_{s-1} - r\mu_{r-1}\mu_{s+1} - s\mu_{s-1}\mu_{r+1}\right],$$

$\mu_r$ being the rth central moment in the population.

Show that in large samples from a symmetrical distribution,
the correlation between two consecutive central moments is zero.

[M.A. Bombay '50]

7. Prove that the standard error of the frequency $f_i$ in the
ith class in large samples of size n is approximately given by
$\sqrt{f_i\left(1 - \dfrac{f_i}{n}\right)}$.

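8. A finite population consists of N individuals with values
$x_1, x_2, \ldots, x_N$. $\bar{x}$ is the mean of a simple random sample of size n
drawn without replacement.

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2.$$

Show that $E(\bar{x}) = \mu$, $V(\bar{x}) = \dfrac{\sigma^2}{n} \cdot \dfrac{N-n}{N-1}$.

Derive an unbiased estimate of $\sigma^2$.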

9. Show that the sample raw moment and the sample
central moment converge in probability to the corresponding
population moments as the sample size becomes indefinitely
large.

10. If $\bar{x}$ is the mean of identical and independent variables
$x_1, x_2, \ldots, x_n$ with mean $\mu$ and variance $\sigma^2$, show that $\bar{x}$ converges
in probability to $\mu$. Show further that as $n \to \infty$, $\dfrac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$ is, in
the limit, normally distributed.

11. In samples of size n from an infinite population find the
covariance between the arithmetic mean and variance.

[M.A. Madras '56]

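12. Show that the coefficient of correlation between the mean
of a sample of n individuals and its squared S.D. is given by

$$\frac{\sqrt{(n-1)\beta_1}}{\sqrt{\beta_2(n-1) - n + 3}}.$$

[B.A. (Hons.) Delhi '62, M.A. Bombay '54]

[Hint. $\operatorname{Cov}(\bar{x}, m_2) = \dfrac{n-1}{n^2}\,\mu_3$, $\operatorname{Var}(\bar{x}) = \dfrac{\mu_2}{n}$,

$$\rho = \frac{\operatorname{Cov}(\bar{x}, m_2)}{\sqrt{\operatorname{Var}(\bar{x})\operatorname{Var}(m_2)}} = \frac{\sqrt{(n-1)\beta_1}}{\sqrt{(n-1)(\beta_2 - 1) + 2}}.]$$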



13 

EXACT SAMPLING DISTRIBUTIONS (t, Z & F ETC.)
AND TESTS OF SIGNIFICANCE


13.1. Unbiased estimate of the variance of population.

Suppose a sample $x_i$, $i = 1, 2, \ldots, n$ is drawn from a population
with mean $\mu$ and variance $\sigma^2$; then the sampling distribution of the
mean $\bar{x}$ has mean $\mu$ and variance $\sigma^2/n$. Further, the unbiased estimate
of the population variance is given by

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \quad \text{i.e. } E(s^2) = \sigma^2.$$

Definition. The statistic $t = t(x_1, x_2, \ldots, x_n)$ is called an unbiased
estimate (or estimator) of the parameter $\theta$, if $E(t) = \theta$. This
property merely states that the random variable t possesses a
distribution whose mean is the parameter $\theta$ being estimated. This
property holds, for example, for $t = \bar{x}$ when estimating the mean
$\mu$ of a distribution. For

$$E(\bar{x}) = E\left[\frac{1}{n}(x_1 + x_2 + \ldots + x_n)\right] = \frac{1}{n}(n\mu) = \mu.$$

Also,

$$\sigma_{\bar{x}}^2 = E(\bar{x} - \mu)^2 = E\left[\frac{1}{n}(x_1 + x_2 + \ldots + x_n) - \mu\right]^2 = \frac{1}{n^2}E\left[(x_1 - \mu) + (x_2 - \mu) + \ldots + (x_n - \mu)\right]^2$$

$$= \frac{1}{n^2}\left[E(x_1 - \mu)^2 + E(x_2 - \mu)^2 + \ldots + E(x_n - \mu)^2\right]$$

since the x's are independent and therefore

$$E(x_i - \mu)(x_j - \mu) = 0, \quad i \neq j.$$

Thus

$$\sigma_{\bar{x}}^2 = \frac{1}{n^2}\left[\sigma^2 + \sigma^2 + \ldots + \sigma^2\right] = \frac{\sigma^2}{n}.$$

Let $S^2 = \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$ = the sample variance. Then

$$E(S^2) = E\left[\frac{1}{n}\sum_i (x_i - \mu)^2 - (\bar{x} - \mu)^2\right] = \frac{1}{n}\sum_i E(x_i - \mu)^2 - E(\bar{x} - \mu)^2 = \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\,\sigma^2.$$

This shows $S^2$ is not an unbiased estimate of $\sigma^2$, which means
that if repeated samples of size n are taken and if the resulting
sample variances are averaged, the average will not approach the
true value but will be consistently too small by the factor
$(n-1)/n$. For small samples this factor becomes important. Consequently,
one must be careful how he combines samples in making
an estimate of the true variance when an unbiased estimate is
desired. In order to overcome the bias in $S^2$ it is merely necessary
to multiply $S^2$ by $n/(n-1)$ and use the result as the estimate of
$\sigma^2$. Then

$$E\left[\frac{n}{n-1}\,S^2\right] = \sigma^2.$$

Since

$$\frac{n}{n-1}\,S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2,$$

it is clear that one can avoid the bias in estimating variance by
dividing the sum of squares of deviations by $(n-1)$ rather than
by n. It is because of this property that some authors define
sample variance to be

$$\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}.$$
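The bias factor $(n-1)/n$ can be seen exactly on a toy population by averaging $S^2$ over every possible sample. The sketch below assumes independent draws from three equally likely values (a made-up population); the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_sample_variances(values, n):
    """Average S^2 (divisor n) and s^2 (divisor n - 1) over all ordered
    samples of n independent draws from equally likely `values`."""
    N = len(values)
    mu = Fraction(sum(values), N)
    sigma2 = sum((Fraction(v) - mu) ** 2 for v in values) / N
    total_S2 = Fraction(0)
    count = 0
    for sample in product(values, repeat=n):
        xbar = Fraction(sum(sample), n)
        total_S2 += sum((Fraction(x) - xbar) ** 2 for x in sample) / n
        count += 1
    e_S2 = total_S2 / count
    e_s2 = e_S2 * Fraction(n, n - 1)   # s^2 = n S^2 / (n - 1)
    return sigma2, e_S2, e_s2


sigma2, e_S2, e_s2 = expected_sample_variances([1, 2, 6], 3)
print(sigma2, e_S2, e_s2)  # 14/3 28/9 14/3
```

With n = 3 the average of $S^2$ is two-thirds of $\sigma^2$, while the average of $s^2$ equals $\sigma^2$ exactly.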

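13.2. Fundamental Theorem of sampling.

If $x_1, x_2, \ldots, x_n$ is a sample from the distribution $N(0, \sigma^2)$ and
if $C = [c_{ij}]$ is an orthogonal transformation matrix producing new
variables

$$\xi_i = \sum_{j=1}^{n} c_{ij}\, x_j, \qquad i = 1, 2, \ldots, n,$$

then $\xi_1, \xi_2, \ldots, \xi_n$ also have the distribution of a sample from the
distribution $N(0, \sigma^2)$.

Proof. We have $\xi = CX$, so that

$$\sum_i \xi_i^2 = \xi'\xi = X'C'CX = X'X = \sum_i x_i^2$$

since $C' = C^{-1}$ from the property of an orthogonal matrix and

$$C'C = C^{-1}C = I.$$

The probability element for the sample $x_1, x_2, \ldots, x_n$ is

$$dp = \frac{1}{(2\pi\sigma^2)^{n/2}}\, e^{-\frac{1}{2\sigma^2}\sum x_i^2}\, dx_1\, dx_2 \ldots dx_n.$$

Making the substitution

$$\xi_i = \sum_{j=1}^{n} c_{ij}\, x_j,$$

the Jacobian of the transformation is $\dfrac{1}{|C|} = \pm 1$, since $|C'C| = 1$, so that

$$dx_1\, dx_2 \ldots dx_n = d\xi_1\, d\xi_2 \ldots d\xi_n.$$

We find:

$$dp = \frac{1}{(2\pi\sigma^2)^{n/2}}\, e^{-\frac{1}{2\sigma^2}\sum \xi_i^2}\, d\xi_1\, d\xi_2 \ldots d\xi_n.$$

This is the distribution of a sample from the distribution $N(0, \sigma^2)$.

Ex. 1. Consider the orthogonal transformation

$$\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} = \begin{bmatrix} \dfrac{1}{\sqrt{2}} & \dfrac{1}{\sqrt{2}} \\[2mm] -\dfrac{1}{\sqrt{2}} & \dfrac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

$$\Rightarrow \xi_1 = \frac{1}{\sqrt{2}}(x_1 + x_2), \qquad \xi_2 = \frac{1}{\sqrt{2}}(-x_1 + x_2).$$

The variables $\xi_1$, $\xi_2$ are independent; each is distributed as $N(0, \sigma^2)$.

Thus the sample mean $\bar{x} = \dfrac{\xi_1}{\sqrt{2}}$ is statistically independent of the difference
of the sample values $x_2 - x_1 = \xi_2\sqrt{2}$, and each has a normal distribution.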

80 


Mathematical Statistics 


Ex. 2. Consider the orthogonal transformation for a sample 
■*i> ^2» •••» %n from N (£, o*). 


lx ~ ' 

Is 

• . 

In — 

=> Sl = 


We have 


1 

1 

_ 1 _ 


1 


V" 

y/n 

v'/i' 


V 71 

-Vi 

1 

1 





V 2 

V2 

0 


0 

** 

1 

1 

2 


o 

II 


V 6 

V6 

V 6 


• 

• 

• 

1 

1 

1 


n -1 l| 


\/{n(n 

- 1 )} ^/{n(n- 

0{ Vi»(* 

- 1 )} - 


x n 

y/n x= 

•^l4-**-+AT n e 

/w * 

y/n 

= J2 


9 


ln=- 

1 r 

/{«(/!-!)} 

- 1 ) x n —(x 

■j-j-... 



£•(?,)= 

£-|*l+***+*»lj 


•4-^) 

=y/np 



£(5/)=0« ,==2 « 3 » 4,..., n. 

From the theorem £ x , Z 2 ....,Z„ are independent each normally 

distributed with vaiiance o a . Thus the sample mean x= —- is sta- 

v fn 

tistically independent of any difference among the x\. 

Also Zi*=Zx? y y-«< 


Then E a »+-.. + S„ 2 = 27 

i=l 

_ £ 

7Vjmj the sample mean x= is statistically independent of the 
sample variance 


S‘=~ 2 (A,—jc)»—■+...+?„») 

This is a remarkable property, it is true only for sampling 
from a normal distribution. 
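The algebraic identities used in Ex. 2 can be verified numerically for any particular sample. The sketch below builds the matrix row by row, following the pattern shown above, and checks $\xi_1 = \sqrt{n}\,\bar{x}$ and $\xi_2^2 + \ldots + \xi_n^2 = \sum (x_i - \bar{x})^2$ on made-up data; the function names are hypothetical.

```python
from math import sqrt, isclose


def helmert_matrix(n):
    """Orthogonal matrix whose first row is (1/sqrt(n), ..., 1/sqrt(n))
    and whose i-th row (i >= 2) contrasts x_i with x_1, ..., x_{i-1}."""
    rows = [[1 / sqrt(n)] * n]
    for i in range(2, n + 1):
        scale = 1 / sqrt(i * (i - 1))
        row = [-scale] * (i - 1) + [(i - 1) * scale] + [0.0] * (n - i)
        rows.append(row)
    return rows


def transform(C, x):
    """Return xi = C x."""
    return [sum(c * v for c, v in zip(row, x)) for row in C]


x = [4.0, 7.0, 1.0, 9.0, 4.0]   # made-up sample
n = len(x)
C = helmert_matrix(n)
xi = transform(C, x)
xbar = sum(x) / n
nS2 = sum((v - xbar) ** 2 for v in x)

print(isclose(xi[0], sqrt(n) * xbar))               # True
print(isclose(sum(v * v for v in xi[1:]), nS2))     # True
```

The check is purely algebraic; the independence of $\bar{x}$ and $S^2$ additionally requires the normality assumed in the theorem.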

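13.3. Simultaneous sampling distribution of the mean and
variance in random samples drawn from a normal population.

Let the sample from the normal population $N(\mu, \sigma^2)$ be $x_1$,
$x_2, \ldots, x_n$. The sample variance is

$$S^2 = \frac{1}{n}\sum_i (x_i - \bar{x})^2, \quad \text{i.e. } nS^2 = \sum_i (x_i - \bar{x})^2.$$

We show that $\dfrac{nS^2}{\sigma^2}$ is a $\chi^2$ variable.

We know that

$$\frac{1}{n}\sum_i (x_i - \bar{x})^2 = \frac{1}{n}\sum_i x_i^2 - \bar{x}^2 \ \Rightarrow\ \sum_i x_i^2 = nS^2 + n\bar{x}^2 \qquad \ldots(1)$$

Consider the orthogonal linear transformation of variables

$$\begin{bmatrix} \xi_1 \\ \xi_2 \\ \vdots \\ \xi_n \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad \text{i.e. } \xi = CX,$$

where C is an orthogonal matrix, implying $C'C = I$, i.e.

$$\sum_i c_{ij}^2 = 1 = \sum_j c_{ij}^2 \quad \text{and} \quad \sum_i c_{ij}c_{ik} = 0 = \sum_i c_{ji}c_{ki} \quad (j \neq k).$$

Thus $\xi'\xi = X'C'CX = X'I_nX = X'X$

$$\Rightarrow \sum_{i=1}^{n} \xi_i^2 = \sum_{i=1}^{n} x_i^2 \qquad \ldots(2)$$

Let us choose $c_{11} = c_{12} = \ldots = c_{1n} = \dfrac{1}{\sqrt{n}}$.

Then

$$\xi_1 = \frac{1}{\sqrt{n}}(x_1 + x_2 + \ldots + x_n) = \frac{1}{\sqrt{n}}\, n\bar{x} = \sqrt{n}\,\bar{x}.$$

Hence

$$nS^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 = \sum_{i=1}^{n} \xi_i^2 - \xi_1^2 = \sum_{i=2}^{n} \xi_i^2 \ \Rightarrow\ \frac{nS^2}{\sigma^2} = \sum_{i=2}^{n} \frac{\xi_i^2}{\sigma^2} \qquad \ldots(3)$$

If we choose the mean of the normal parent to be zero, then
$x_i \sim N(0, \sigma^2) \Rightarrow \xi_i \sim N(0, \sigma^2)$ from (2)

$$\Rightarrow \sum_{i=2}^{n} \frac{\xi_i^2}{\sigma^2} = \frac{nS^2}{\sigma^2} \text{ from (3)} = \chi^2_{n-1}.$$

In other words $\sum_{i=2}^{n} \dfrac{\xi_i^2}{\sigma^2}$ is distributed like $\chi^2$ with $n - 1$
D.F. Consequently $\dfrac{1}{2}\dfrac{nS^2}{\sigma^2}$ is a Gamma variate with parameter
$\tfrac{1}{2}(n-1)$ and its distribution is

$$dp = \frac{1}{\Gamma\left(\tfrac{1}{2}(n-1)\right)}\, e^{-\frac{nS^2}{2\sigma^2}} \left(\frac{nS^2}{2\sigma^2}\right)^{\frac{1}{2}(n-3)} d\left(\frac{nS^2}{2\sigma^2}\right) \qquad \ldots(4)$$

The unbiased estimate of the variance of the parent population
is given by

$$s^2 = \frac{1}{n-1}\sum_i (x_i - \bar{x})^2$$

$$\Rightarrow nS^2 = (n-1)\,s^2 = \nu s^2 \qquad [n - 1 = \nu],$$

the estimate $s^2$ being based on $\nu$ D.F.

The distribution of $s^2$ is therefore

$$dp = \frac{1}{\Gamma\left(\tfrac{1}{2}\nu\right)}\, e^{-\frac{\nu s^2}{2\sigma^2}} \left(\frac{\nu s^2}{2\sigma^2}\right)^{\frac{1}{2}(\nu-2)} d\left(\frac{\nu s^2}{2\sigma^2}\right)$$

$$= \frac{1}{\Gamma\left(\tfrac{1}{2}\nu\right)} \left(\frac{\nu}{2\sigma^2}\right)^{\frac{1}{2}\nu} e^{-\frac{\nu s^2}{2\sigma^2}}\, (s^2)^{\frac{1}{2}(\nu-2)}\, ds^2 = f(s^2)\, ds^2, \text{ say.}$$

Suppose $\sigma^2$ is unknown. When shall $f(s^2)$ be maximum for
the given value of $s^2$? For this

$$\frac{\partial}{\partial \sigma^2} f(s^2) = 0 \ \Rightarrow\ \hat{\sigma}^2 = s^2.$$

For this reason $s^2$ is called the optimum value of the population
variance corresponding to the given sample. From (3) above,
it is obvious that

$\xi_1$ is independent of $\sum_{i=2}^{n} \xi_i^2$

$\Rightarrow \bar{x}$ is independent of $S^2$ $\left[\text{for } \xi_1 = \sqrt{n}\,\bar{x},\ nS^2 = \sum_{i=2}^{n} \xi_i^2\right]$

$\Rightarrow$ Sampling distributions of the mean and variance are
independent.

Now $\xi_1 \sim N(0, \sigma^2) \Rightarrow \sqrt{n}\,\bar{x} \sim N(0, \sigma^2) \Rightarrow \bar{x} \sim N\left(0, \dfrac{\sigma^2}{n}\right)$.

Thus the sample mean is normally distributed about the mean
of the population, and with variance $\dfrac{\sigma^2}{n}$.

Aliter: Method of M.G.F.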

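We have

$$\sum_{i=1}^{n} (x_i - \mu)^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2 \qquad \ldots(1)$$

where $x_1, x_2, \ldots, x_n$ is a random sample from $N(\mu, \sigma^2)$.

We know that $\bar{x}$ is $N\left(\mu, \dfrac{\sigma^2}{n}\right)$ so that $\sqrt{n}\,(\bar{x} - \mu)/\sigma$ is $N(0, 1)$.

Hence $\dfrac{n(\bar{x} - \mu)^2}{\sigma^2}$ is $\chi^2_1$.

Since $\bar{x}$ and $S^2$, and hence $\dfrac{n(\bar{x} - \mu)^2}{\sigma^2}$ and $\dfrac{nS^2}{\sigma^2}$, are statistically
independent, we have

$$E\left[e^{\,t\sum (x_i - \mu)^2/\sigma^2}\right] = E\left[e^{\,t\, n(\bar{x} - \mu)^2/\sigma^2}\right] \cdot E\left[e^{\,t\, nS^2/\sigma^2}\right] \quad \text{from (1)}.$$

That is,

$$(1 - 2t)^{-n/2} = (1 - 2t)^{-1/2}\, E\left[e^{\,t\, nS^2/\sigma^2}\right] \qquad \left[\text{Recall } M_{\chi^2_n}(t) = (1 - 2t)^{-n/2}\right]$$

$$\Rightarrow E\left[e^{\,t\, nS^2/\sigma^2}\right] = (1 - 2t)^{-\frac{n-1}{2}}.$$

But this is the m.g.f. of a chi-square distribution with $n - 1$
D.F.

$\therefore\ \dfrac{nS^2}{\sigma^2}$ is $\chi^2_{n-1}$.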
Ex. $\bar{x}$ is the mean of a sample of n independent observations
from a population with density

$$f(x) = \mu x^{\mu - 1}, \qquad \mu > 0, \quad 0 < x < 1.$$

Show that, for n sufficiently large, $\bar{x}$ is approximately distributed
as a normal variate with mean $\dfrac{\mu}{\mu + 1}$ and variance

$$\frac{\mu}{n\,(\mu + 1)^2 (\mu + 2)}.$$


Exercises

Ex. 1. A linear transformation $\xi_i = \sum_{j=1}^{n} c_{ij}x_j$, $(i = 1, 2, \dots, n)$ is said to be orthogonal if

$$\sum_{i=1}^{n} c_{ij}c_{ik} = \delta_{jk}$$

[$\delta_{jk} = 1$ if $j = k$, $\delta_{jk} = 0$ if $j \neq k$. This is known as the Kronecker delta.]

Show that for such a transformation

(i) $\sum_{i=1}^{n} \xi_i^2 = \sum_{i=1}^{n} x_i^2$

(ii) If $x_1, \dots, x_n$ are independent standard normal variates, so are $\xi_1, \xi_2, \dots, \xi_n$.

Ex. 2. Show that $\frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2$ is distributed as $\chi^2$ with $n$ D. F. Explain why $\frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\bar{x})^2$ is distributed as $\chi^2$ with $n-1$ D. F.

Ex. 3. Write down the density function for

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$$

and show that it is maximum when $s^2 = \sigma^2$.

Ex. 4. Show that the m. g. f. of the distribution of $S^2$ is

$$\left(1 - \frac{2\sigma^2 t}{n}\right)^{-(n-1)/2}.$$

Hence $\mu_1' = \frac{n-1}{n}\sigma^2$ and $\mathrm{Var}(S^2) = 2\,\frac{n-1}{n^2}\,\sigma^4$. Deduce for large samples

$$\mathrm{Var}(S^2) = \frac{2\sigma^4}{n}.$$


Ex. 5. Prove the following result for a random sample $x_1, \dots, x_n$ drawn from a population with finite variance $\sigma^2$ :



Exact Sampling Distributions (t, Z & F etc.) and Tests of Significance


I, /7<rt 


?/— S CijXj i 

u j==l 

Where S c lt c Jt = 6 ,j 

where 8i, is Kronecker delta, show that 

n n 

l i~ ^ J/a2 

^ distributed as X* wi,h'7n- P) degrees of free(Jom 

[Delhi '56]

Ex. 7. If $x_1, x_2, \dots, x_n$ are independent random variables, each distributed normally with mean zero and variance one, show that $\sum x_i$ and $\sum (x_i-\bar{x})^2$ are independent. Find the mean and variance of $\frac{1}{n}\sum (x_i-\bar{x})^2$ and show that they approach 1 and 0 as $n$ increases. [M. A. Aligarh '60]

fandom s 8 amni?of nd mea " a " d ' be VariaDce of a 

•heformul^or" em r ° maninfini ' e P°P ula,i °". derive 
0) Var (*) 

(jjf ^ ar (-S' 2 ), for large n 
(iii) Var ( S ), for large n 

in 'argetmpies 5 *= 2 <»««. 

6 mp,es » a a °d b are uncorrelated. 

*. fM A - Pun i ab ’58; M. A. Patna ’56] 

distribution of the'mean InVamnlesof fUnCt, ® n of lhe sampling 
distribution. P f S,Ze n frora the rectangular 


Show that 


df '=dx, 

01=0. 0,-3 —£ 

Vn 

JM. A. Delhi *58, M. A. Bombay *5i] 


[Hint. Use the property of characteristic functions

$$\phi_{x_1+x_2+\dots+x_n}(t) = \phi_{x_1}(t)\,\phi_{x_2}(t) \dots \phi_{x_n}(t) = [\phi_x(t)]^n$$

where $\phi_x(t) = E(e^{itx})$]

Ex. 10 By using the characteristic function, obtain the 
distribution of the mean of normal and rectangular distributions, 
the rectangular distribution having the range 0 to 1. 

[I. C. A. Q. Delhi *48] 

13.4. Student's t distribution

Suppose $x_1, \dots, x_n$ is a simple sample from a normal population $N(\mu, \sigma^2)$, then

$$E(\bar{x}-\mu) = 0, \quad V(\bar{x}-\mu) = \frac{\sigma^2}{n},$$

where $\bar{x}$ is the sample mean, and

$$\frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$

If, however, in place of $\sigma$ we use the variable estimate $s$ obtained from the sample by the formula

$$s^2 = \frac{1}{n-1}\sum (x_i-\bar{x})^2,$$

we have the statistic t:

$$\frac{\bar{x}-\mu}{s/\sqrt{n}} = \frac{(\bar{x}-\mu)\sqrt{n}}{s} = t, \text{ say,}$$

which is not normally distributed.

The distribution of $t$ was first found by the English statistician W. S. Gosset, who published his research under the pseudonym 'Student'.

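From its definition

$$t^2 = \frac{n(\bar{x}-\mu)^2}{s^2} \Rightarrow \frac{t^2}{n-1} = \frac{n(\bar{x}-\mu)^2}{nS^2}, \quad \text{since } nS^2 = \sum (x_i-\bar{x})^2 = (n-1)s^2$$

$$= \frac{(\bar{x}-\mu)^2/(\sigma^2/n)}{nS^2/\sigma^2} = \frac{\chi^2_1}{\chi^2_k}, \quad k = n-1$$

$$= \beta_2\left(\tfrac{1}{2}, \tfrac{k}{2}\right) \text{ variate.}$$

Thus the probability that in random sampling, the value of $t^2$ will fall in the interval $dt^2$ is

$$dF = \frac{1}{B\left(\frac{1}{2}, \frac{k}{2}\right)} \frac{(t^2/k)^{-1/2}}{(1+t^2/k)^{(k+1)/2}}\, d\left(\frac{t^2}{k}\right), \quad 0 < t^2 < \infty$$

[recall $\beta_2(m, n)$ distribution is given by $\frac{1}{B(m, n)} \frac{y^{m-1}}{(1+y)^{m+n}}\, dy$, $0 < y < \infty$]

or

$$dp = \frac{dt}{\sqrt{k}\, B\left(\frac{1}{2}, \frac{k}{2}\right) \left(1+\frac{t^2}{k}\right)^{(k+1)/2}}, \quad -\infty < t < \infty$$

the factor 2 disappearing, since the integral of $dp$ over the whole range of variation of $t$ must be unity. The distribution is spoken of as the $t$ distribution corresponding to $k$ degrees of freedom.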

Observe that although the value of t denemf gfeeS ° f freedom - 

does The essc g n(iaI fea,;; e oV, s“ b 0 :,;’ , i,S mt diS,ribu,i0D 

,nd ——- 

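(1) The distribution has been tabulated. The values $t_{k,\alpha}$ satisfying the condition

$$\int_{-\infty}^{t_{k,\alpha}} g_k(t)\, dt = \alpha,$$

[where $g_k(t)$, the p. d. f. of $t$, is given by

$$g_k(t) = \frac{\Gamma\left(\frac{k+1}{2}\right)}{\sqrt{\pi k}\; \Gamma\left(\frac{k}{2}\right)} \left(1+\frac{t^2}{k}\right)^{-(k+1)/2}, \quad -\infty < t < \infty]$$

are tabulated. For values satisfying $0 < \alpha < 0.5$ we can use tabulated values because of the symmetry of the distribution.

(2) The graph of $g_k(t)$ is symmetrical. In fact it resembles the graph of the normal distribution. It is easy to show that $g_k(t)$ tends to the standard normal density as $k \to \infty$.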


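We have

$$\lim_{k\to\infty} g_k(t) = \lim_{k\to\infty} \frac{\Gamma\left(\frac{k+1}{2}\right)}{\sqrt{\pi k}\; \Gamma\left(\frac{k}{2}\right)} \cdot \lim_{k\to\infty} \left(1+\frac{t^2}{k}\right)^{-(k+1)/2}.$$

Now

$$\lim_{k\to\infty} \left(1+\frac{t^2}{k}\right)^{-(k+1)/2} = \lim_{k\to\infty} \frac{1}{\left(1+\frac{t^2/2}{k/2}\right)^{k/2} \left(1+\frac{t^2}{k}\right)^{1/2}} = e^{-t^2/2}$$

and

$$\lim_{k\to\infty} \frac{\Gamma\left(\frac{k+1}{2}\right)}{\sqrt{\pi k}\; \Gamma\left(\frac{k}{2}\right)} = \lim_{k\to\infty} \frac{1}{\sqrt{\pi k}} \cdot \frac{\left(\frac{k-1}{2}\right)!}{\left(\frac{k-2}{2}\right)!}$$

$$= \lim_{k\to\infty} \frac{1}{\sqrt{\pi k}} \cdot \frac{\sqrt{2\pi}\; e^{-(k-1)/2} \left(\frac{k-1}{2}\right)^{k/2}}{\sqrt{2\pi}\; e^{-(k-2)/2} \left(\frac{k-2}{2}\right)^{(k-1)/2}}$$

{Recall for large $n$, $n! \sim \sqrt{2\pi n}\; e^{-n} n^n = \sqrt{2\pi}\; e^{-n} n^{n+1/2}$}

$$= \lim_{k\to\infty} \frac{1}{\sqrt{\pi k}}\; e^{-1/2} \left(\frac{k}{2}\right)^{1/2} \frac{\left(1-\frac{1}{k}\right)^{k/2}}{\left(1-\frac{2}{k}\right)^{(k-1)/2}}$$

$$= \frac{1}{\sqrt{2\pi}} \cdot \frac{e^{-1/2}\, e^{-1/2}}{e^{-1}} = \frac{1}{\sqrt{2\pi}}.$$

Hence

$$\lim_{k\to\infty} g_k(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}.$$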
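The limiting behaviour can be illustrated numerically. The sketch below assumes the density $g_k(t) = \Gamma(\frac{k+1}{2})\,(1+t^2/k)^{-(k+1)/2} / \{\sqrt{\pi k}\,\Gamma(\frac{k}{2})\}$ and compares it with the standard normal ordinate for increasing $k$; the function names are illustrative only.

```python
import math

def t_pdf(t, k):
    """Density of Student's t with k degrees of freedom (log-gamma form avoids overflow)."""
    log_c = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
             - 0.5 * math.log(math.pi * k))
    return math.exp(log_c - (k + 1) / 2 * math.log1p(t * t / k))

def normal_pdf(t):
    """Standard normal density."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

# The ordinate at t = 1 approaches the normal ordinate as k grows.
for k in (1, 5, 30, 1000):
    print(k, round(t_pdf(1.0, k), 5), round(normal_pdf(1.0), 5))
```

For $k = 1$ the ordinate at $t = 0$ is $1/\pi$ (the Cauchy case), which gives a convenient spot check of the constant.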

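Aliter : Direct derivation of the t-distribution.

Suppose that the random variables $w$ and $v$ are independent and have distributions $N(0, 1)$ and $\chi^2_k$ respectively.

Define

$$t = \frac{w}{\sqrt{v/k}}.$$

The joint probability element is

$$dp = \frac{1}{\sqrt{2\pi}}\, e^{-w^2/2}\, dw \cdot \frac{1}{2\,\Gamma(k/2)}\, e^{-v/2} \left(\frac{v}{2}\right)^{\frac{k}{2}-1} dv.$$

Let us change variables to

$$t = \frac{w}{\sqrt{v/k}}, \quad v = v, \quad -\infty < t < \infty, \; 0 < v < \infty.$$

The Jacobian of the transformation is

$$\frac{\partial(w, v)}{\partial(t, v)} = \begin{vmatrix} \sqrt{v/k} & \dfrac{t}{2\sqrt{vk}} \\ 0 & 1 \end{vmatrix} = \sqrt{v/k},$$

so that $dw\, dv = \sqrt{v/k}\; dt\, dv$.

Hence the joint distribution of $t$ and $v$ is

$$\frac{1}{2\sqrt{2\pi}\; \Gamma(k/2)} \left(\frac{v}{2}\right)^{\frac{k}{2}-1} e^{-\frac{v}{2}\left(1+\frac{t^2}{k}\right)} \sqrt{v/k}\; dv\, dt.$$

To find the marginal distribution of $t$, we integrate out $v$:

$$g_k(t) = \frac{1}{2\sqrt{2\pi}\; \Gamma(k/2)\, \sqrt{k}\; 2^{\frac{k}{2}-1}} \int_0^\infty v^{\frac{k-1}{2}}\, e^{-\frac{v}{2}\left(1+\frac{t^2}{k}\right)}\, dv$$

$$= \frac{1}{2^{k/2} \sqrt{2\pi}\; \Gamma(k/2)\, \sqrt{k}} \cdot \frac{\Gamma\left(\frac{k+1}{2}\right)}{\left\{\frac{1}{2}\left(1+\frac{t^2}{k}\right)\right\}^{(k+1)/2}}$$

[recall $\frac{\Gamma(n)}{a^n} = \int_0^\infty e^{-ay}\, y^{n-1}\, dy$]

or

$$g_k(t) = \frac{\Gamma\left(\frac{k+1}{2}\right)}{\Gamma(k/2)\, \sqrt{\pi k}} \left(1+\frac{t^2}{k}\right)^{-\frac{1}{2}(k+1)}.$$

$g_k(t)$ is the p. d. f. of the t-distribution with $k$ degrees of freedom.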
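The integration over $v$ in the direct derivation can be verified numerically. This is a rough sketch, assuming the joint density of $(t, v)$ is $\frac{1}{2\sqrt{2\pi}\,\Gamma(k/2)}(v/2)^{k/2-1} e^{-\frac{v}{2}(1+t^2/k)}\sqrt{v/k}$; Simpson's rule on a truncated range stands in for the exact Gamma integral, and the truncation point and step count are arbitrary choices.

```python
import math

def joint_density(t, v, k):
    """Joint density of (t, v) after the change of variables w = t * sqrt(v / k); needs k >= 2."""
    const = 1.0 / (2.0 * math.sqrt(2.0 * math.pi) * math.gamma(k / 2.0))
    return (const * (v / 2.0) ** (k / 2.0 - 1.0)
            * math.exp(-0.5 * v * (1.0 + t * t / k))
            * math.sqrt(v / k))

def marginal_by_simpson(t, k, upper=200.0, steps=20000):
    """Integrate v out over (0, upper) with composite Simpson's rule (steps must be even)."""
    h = upper / steps
    total = joint_density(t, 0.0, k) + joint_density(t, upper, k)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * joint_density(t, j * h, k)
    return total * h / 3.0

def t_pdf(t, k):
    """Closed form g_k(t) obtained from the Gamma integral."""
    c = math.gamma((k + 1) / 2.0) / (math.gamma(k / 2.0) * math.sqrt(math.pi * k))
    return c * (1.0 + t * t / k) ** (-(k + 1) / 2.0)

print(marginal_by_simpson(1.0, 5), t_pdf(1.0, 5))
```

The two printed numbers should agree to many decimal places, which confirms that the constant in $g_k(t)$ has been carried through correctly.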

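Applications.

The application of this distribution to sampling theory is immediate. Suppose we have a random sample $x_1, x_2, \dots, x_n$ from the normal distribution $N(\mu, \sigma^2)$.

Then

$$w = \frac{(\bar{x}-\mu)\sqrt{n}}{\sigma} \sim N(0, 1), \qquad \chi^2 = \sum_{i=1}^{n} \frac{(x_i-\bar{x})^2}{\sigma^2} \text{ with } n-1 \text{ d. f.,}$$

hence

$$t = \frac{w}{\sqrt{\chi^2/(n-1)}} = \frac{(\bar{x}-\mu)\sqrt{n}}{s},$$

where

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2,$$

is therefore distributed according to the $t$ distribution with $(n-1)$ d. f.

The Student distribution may also be used in connection with two samples. Let $x_i$ $(i = 1, 2, \dots, n_1)$ be the values of the variable in the first sample from $N(\mu, \sigma^2)$ and $x'_j$ $(j = 1, 2, \dots, n_2)$ those in the second from $N(\mu', \sigma^2)$. Let $\bar{x}_1$ and $\bar{x}_2$ be the sample means and

$$s_1^2 = \frac{1}{n_1-1}\sum_i (x_i-\bar{x}_1)^2, \qquad s_2^2 = \frac{1}{n_2-1}\sum_j (x'_j-\bar{x}_2)^2.$$

Then

$$\frac{(\bar{x}_1-\bar{x}_2)-(\mu-\mu')}{\sqrt{\sigma^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} \sim N(0, 1)$$

and

$$\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{\sigma^2} \sim \chi^2_{n_1+n_2-2}.$$

Hence the ratio

$$t = \frac{(\bar{x}_1-\bar{x}_2)-(\mu-\mu')}{\sqrt{(n_1-1)s_1^2+(n_2-1)s_2^2}} \cdot \frac{\sqrt{n_1+n_2-2}}{\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$$

is distributed according to $t_{n_1+n_2-2}$.

In case $\mu = \mu'$,

$$t = \frac{(\bar{x}_1-\bar{x}_2)\sqrt{n_1+n_2-2}}{\sqrt{\{(n_1-1)s_1^2+(n_2-1)s_2^2\}\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}.$$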
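Chief Features of the t-distribution.

(i) The probability curve of the t-distribution is

$$y = \frac{1}{\sqrt{k}\, B\left(\frac{1}{2}, \frac{k}{2}\right)} \cdot \frac{1}{\left(1+\frac{t^2}{k}\right)^{\frac{1}{2}(k+1)}}.$$

(ii) It is symmetrical about the line $t = 0$.

(iii) Since $y \propto \left(1+\frac{t^2}{k}\right)^{-\frac{1}{2}(k+1)}$, $y$ decreases rapidly as $|t|$ increases. As $|t| \to \infty$, $y \to 0$, i.e. the curve is asymptotic to the $t$-axis at each end.

(iv) $y$ is maximum when $t = 0$.

(v) $$P(|t| > t_0) = 2\int_{t_0}^{\infty} \frac{dt}{\sqrt{k}\, B\left(\frac{1}{2}, \frac{k}{2}\right)\left(1+\frac{t^2}{k}\right)^{\frac{1}{2}(k+1)}}.$$

(vi) Because of symmetry $\mu_1' = 0$ and $\mu_{2r+1} = 0$. Also

$$\mu_{2r} = \int_{-\infty}^{\infty} \frac{t^{2r}\, dt}{\sqrt{k}\, B\left(\frac{1}{2}, \frac{k}{2}\right)\left(1+\frac{t^2}{k}\right)^{\frac{1}{2}(k+1)}} = 2\int_0^{\infty} \frac{t^{2r}\, dt}{\sqrt{k}\, B\left(\frac{1}{2}, \frac{k}{2}\right)\left(1+\frac{t^2}{k}\right)^{\frac{1}{2}(k+1)}}.$$

Put $1+\frac{t^2}{k} = \frac{1}{y}$, so that $t^2 = \frac{k(1-y)}{y}$ and $2t\, dt = -\frac{k}{y^2}\, dy$.

Hence

$$\mu_{2r} = \frac{k^r}{B\left(\frac{1}{2}, \frac{k}{2}\right)} \int_0^1 y^{\frac{k}{2}-r-1} (1-y)^{r-\frac{1}{2}}\, dy = \frac{k^r\, B\left(\frac{k}{2}-r,\; r+\frac{1}{2}\right)}{B\left(\frac{1}{2}, \frac{k}{2}\right)}$$

$$= \frac{k^r\, \Gamma\left(r+\frac{1}{2}\right) \Gamma\left(\frac{k}{2}-r\right)}{\Gamma\left(\frac{1}{2}\right) \Gamma\left(\frac{k}{2}\right)} = \frac{k^r\, (2r-1)(2r-3) \dots 3 \cdot 1}{(k-2)(k-4) \dots (k-2r)}, \quad k > 2r.$$

In particular

$$\mu_2 = \frac{k}{k-2}, \qquad \mu_4 = \frac{3k^2}{(k-2)(k-4)},$$

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0 \quad \text{and} \quad \beta_2 = \frac{\mu_4}{\mu_2^2} = 3\left(\frac{k-2}{k-4}\right).$$

$\beta_2 \to 3$ as $k \to \infty$, so that for large $k$ the t-distribution tends to the normal distribution.

The recursion formula for the $t$ distribution is given by

$$\mu_{2r} = \frac{k(2r-1)}{(k-2r)}\, \mu_{2r-2}.$$

The reduced variate is $\dfrac{t}{\sqrt{k/(k-2)}}$.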




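Ex. Express the constants $y_0$, $a$ and $m$ of the distribution

$$dF(x) = y_0 \left(1-\frac{x^2}{a^2}\right)^m dx, \quad -a < x < a$$

in terms of its $\sigma^2$ and $\beta_2$.

Show that if $x$ is related to a variable $t$ by the equation

$$x = \frac{at}{\{2(m+1)+t^2\}^{1/2}}$$

then $t$ has Student's distribution with $2(m+1)$ degrees of freedom. Use the distribution to calculate the probability that $t > 2$, where the degrees of freedom are 2, and also when 4.

[B. A. (Hon's) Delhi '62, M. A. Patna '64]

We have

$$y_0 \int_{-a}^{a} \left(1-\frac{x^2}{a^2}\right)^m dx = 1.$$

Put $x = a \sin\theta$, so that $dx = a\cos\theta\, d\theta$:

$$2y_0\, a \int_0^{\pi/2} \cos^{2m+1}\theta\, d\theta = 1 \Rightarrow 2a y_0 \cdot \tfrac{1}{2} B\left(m+1, \tfrac{1}{2}\right) = 1$$

[recall $\int_0^{\pi/2} \cos^p\theta\, \sin^q\theta\, d\theta = \tfrac{1}{2} B\left(\frac{p+1}{2}, \frac{q+1}{2}\right)$]

$$\Rightarrow y_0 = \frac{1}{a\, B\left(m+1, \frac{1}{2}\right)}. \qquad ...(1)$$

$$\mu_1' = E(x) = y_0 \int_{-a}^{a} x \left(1-\frac{x^2}{a^2}\right)^m dx = 0,$$

the integrand being an odd function.

The substitution $x = a\sin\theta$ yields

$$\mu_{2r} = 2y_0\, a^{2r+1} \int_0^{\pi/2} \cos^{2m+1}\theta\, \sin^{2r}\theta\, d\theta = y_0\, a^{2r+1}\, B\left(m+1, r+\tfrac{1}{2}\right) = a^{2r}\, \frac{B\left(m+1, r+\frac{1}{2}\right)}{B\left(m+1, \frac{1}{2}\right)}.$$

In particular

$$\sigma^2 = \mu_2 = a^2\, \frac{B(m+1, 3/2)}{B(m+1, 1/2)} = a^2\, \frac{1/2}{m+3/2} = \frac{a^2}{2m+3}$$

[recall $B(p, q+1) = \frac{q}{p+q}\, B(p, q)$]

$$\Rightarrow a^2 = (2m+3)\,\sigma^2 \qquad ...(2)$$

$$\mu_4 = a^4\, \frac{B(m+1, 5/2)}{B(m+1, 1/2)} = a^4\, \frac{\frac{3}{2} \cdot \frac{1}{2}}{(m+5/2)(m+3/2)} = \frac{3a^4}{(2m+5)(2m+3)}$$

$$\Rightarrow \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3(2m+3)}{2m+5} \quad \text{or} \quad m = \frac{9-5\beta_2}{2(\beta_2-3)}. \qquad ...(3)$$

Equations (1), (2) and (3) give $y_0$, $a$ and $m$ in terms of $\sigma^2$ and $\beta_2$.

Now $x = \dfrac{at}{\sqrt{k+t^2}}$, where $k = 2(m+1)$, gives

$$dx = \frac{a}{\sqrt{k}\, (1+t^2/k)^{3/2}}\, dt \quad \text{and} \quad 1-\frac{x^2}{a^2} = \frac{1}{1+t^2/k}.$$

This substitution leads to

$$dF(t) = y_0 \cdot \frac{1}{(1+t^2/k)^m} \cdot \frac{a}{\sqrt{k}\, (1+t^2/k)^{3/2}}\, dt = \frac{dt}{\sqrt{k}\, B\left(\frac{1}{2}, \frac{k}{2}\right) (1+t^2/k)^{(k+1)/2}}, \quad -\infty < t < \infty$$

[recall $B(m, n) = B(n, m)$, and $m+1 = k/2$]

i.e. this is Student's distribution with degrees of freedom $= k = 2m+2$.

$P(t > 2)$ with $k = 2$

$$= \int_2^{\infty} \frac{dt}{\sqrt{2}\, B\left(\frac{1}{2}, 1\right) (1+t^2/2)^{3/2}}$$

[substitute $t = \sqrt{2}\tan\theta \Rightarrow dt = \sqrt{2}\sec^2\theta\, d\theta$; $B(\frac{1}{2}, 1) = 2$]

$$= \int_{\tan^{-1}\sqrt{2}}^{\pi/2} \frac{\sqrt{2}\sec^2\theta\, d\theta}{2\sqrt{2}\, \sec^3\theta} = \frac{1}{2}\left[\sin\theta\right]_{\sin^{-1}(\sqrt{2}/\sqrt{3})}^{\pi/2} = \frac{1}{2}\left(1-\frac{\sqrt{6}}{3}\right) = \frac{1}{2}-\frac{\sqrt{6}}{6} \approx 0.092.$$

$P(t > 2)$ with $k = 4$

$$= \int_2^{\infty} \frac{dt}{2\, B\left(\frac{1}{2}, 2\right) (1+t^2/4)^{5/2}}$$

[substitute $t = 2\tan\theta \Rightarrow dt = 2\sec^2\theta\, d\theta$; $B(\frac{1}{2}, 2) = \frac{4}{3}$]

$$= \int_{\pi/4}^{\pi/2} \frac{2\sec^2\theta\, d\theta}{2 \cdot \frac{4}{3} \sec^5\theta} = \frac{3}{4}\int_{\pi/4}^{\pi/2} \cos^3\theta\, d\theta = \frac{3}{4}\int_{\pi/4}^{\pi/2} \frac{\cos 3\theta + 3\cos\theta}{4}\, d\theta$$

$$= \frac{3}{4}\left(\frac{2}{3}-\frac{5\sqrt{2}}{12}\right) = \frac{1}{2}-\frac{5\sqrt{2}}{16} \approx 0.058.$$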




Table 3. Values of |t| with a probability P of being exceeded in random sampling
(v = number of degrees of freedom)

  v     P = 0.50    0.10     0.05     0.02     0.01

  1      1.000      6.34    12.71    31.82    63.66
  2      0.816      2.92     4.30     6.96     9.92
  3      0.765      2.35     3.18     4.54     5.84
  4      0.741      2.13     2.78     3.75     4.60
  5      0.727      2.02     2.57     3.36     4.03
  6      0.718      1.94     2.45     3.14     3.71
  7      0.711      1.90     2.36     3.00     3.50
  8      0.706      1.86     2.31     2.90     3.36
  9      0.703      1.83     2.26     2.82     3.25
 10      0.700      1.81     2.23     2.76     3.17
 11      0.697      1.80     2.20     2.72     3.11
 12      0.695      1.78     2.18     2.68     3.06
 13      0.694      1.77     2.16     2.65     3.01
 14      0.692      1.76     2.14     2.62     2.98
 15      0.691      1.75     2.13     2.60     2.95
 16      0.690      1.75     2.12     2.58     2.92
 17      0.689      1.74     2.11     2.57     2.90
 18      0.688      1.73     2.10     2.55     2.88
 19      0.688      1.73     2.09     2.54     2.86
 20      0.687      1.72     2.09     2.53     2.84
 21      0.686      1.72     2.08     2.52     2.83
 22      0.686      1.72     2.07     2.51     2.82
 23      0.685      1.71     2.07     2.50     2.81
 24      0.685      1.71     2.06     2.49     2.80
 25      0.684      1.71     2.06     2.49     2.79
 26      0.684      1.71     2.06     2.48     2.78
 27      0.684      1.70     2.05     2.47     2.77
 28      0.683      1.70     2.05     2.47     2.76
 29      0.683      1.70     2.04     2.46     2.76
 30      0.683      1.70     2.04     2.46     2.75
 35      0.682      1.69     2.03     2.44     2.72
 40      0.681      1.68     2.02     2.42     2.71
 45      0.680      1.68     2.02     2.41     2.69
 50      0.679      1.68     2.01     2.40     2.68
 60      0.678      1.67     2.00     2.39     2.66
 inf     0.674      1.64     1.96     2.33     2.58

Reproduced by permission of the author, Professor R. A. Fisher, from his book on Statistical Methods for Research Workers.




13.5. Test of Significance : t-test

Test for an assumed population mean.

Suppose we have the random sample values $x_1, \dots, x_n$. We have to test whether the sample has been drawn from a normal population with mean $\mu$. We formulate the null hypothesis that the sample has been taken from the population $N(\mu, \sigma^2)$ in which $\mu$ is known and $\sigma^2$ is unknown. We calculate

$$t = \frac{\bar{x}-\mu}{s/\sqrt{n}} = t_0, \text{ say,}$$

where $\bar{x}$ = sample mean, $s^2 = \frac{1}{n-1}\sum (x_i-\bar{x})^2$, such that $E(s^2) = \sigma^2$.

The statistic $t = \frac{(\bar{x}-\mu)\sqrt{n}}{s}$ is based on $(n-1)$ degrees of freedom. The table gives, for a range of values of d. f., the probability $P$ that a given value of $|t|$ will be exceeded in random sampling from a normal population with mean $\mu$. If $P$ is less than 0.05 we regard the value of $t$ as significant. If $P$ is less than 0.01, the value of $t$ is highly significant.

A significant value of $t$ throws doubt on the truth of the hypothesis that $\mu$ is the mean of the population.

In making the conclusion in regard to the hypothesis we can also compare the calculated value $t_0$ with the tabulated value $t_{0.05}$. If $t_0 < t_{0.05}$ the hypothesis is accepted, and if $t_0 > t_{0.05}$ the hypothesis is rejected.

Ex. 1. Nine patients, to whom a certain drink was administered, registered the following increments in blood pressure (B. P.) : 7, 3, -1, 4, -3, 5, 6, -4, 1. Compute the statistic you would use to determine whether these data indicate that the drink was responsible for these increments and indicate how you would proceed further.

['56; B. Sc. (Hon's) Delhi '64]

For the given sample $\bar{x} = 2$, $n = 9$, d. f. $= 9-1 = 8$, $s = \sqrt{15.75} = 3.97$.

Therefore, on the hypothesis that $\mu = 0$, we have

$$|t| = \frac{|\bar{x}-\mu|\sqrt{n}}{s} = \frac{2 \times 3}{3.97} = 1.51.$$

From the table we find that, for $k = 8$, the probability that this value of $t$ will be exceeded numerically in random sampling is about 0.17. The value is therefore not at all significant, and the test provides no evidence against the assumption of a population mean of zero.

Ex. 2. Ten individuals are chosen at random from a normal population and their heights are found to be 63, 63, 66, 67, 68, 69, 70, 70, 71, 71 inches. Test if the sample belongs to the population in which the mean height is 66 inches. Given

(i) $t_{.05} = 2.26$ for $k = 9$

or (ii) for $t = 1.8$, $P = .947$; for $t = 1.9$, $P = .955$; $k = 9$, where $P$ is the area to the left of the ordinate at $t$.

[M. Sc. Agra '68; B. Sc. (Hon's) Delhi '65]

For calculating $\bar{x}$ and $s$, consider deviations from the assumed origin 68. Thus

                                                      Total
$X = x-68$ :  -5  -5  -2  -1   0   1   2   2   3   3     -2
$X^2$      :  25  25   4   1   0   1   4   4   9   9     82

Hence $\bar{x} = \bar{X}+68 = -\frac{2}{10}+68 = 67.8$

$$\sum (x_i-\bar{x})^2 = \sum X_i^2 - n\bar{X}^2 = 82 - 10\,(-0.2)^2 = 81.6$$

$$s^2 = \frac{1}{n-1}\sum (x_i-\bar{x})^2 = \frac{81.6}{9} = 9.0667 \Rightarrow s = 3.011 \text{ inches.}$$

Now

$$t = \frac{\bar{x}-\mu}{s/\sqrt{n}} = \frac{67.8-66}{3.011/\sqrt{10}} = 1.89 \text{ approx.}$$

and d. f. $= 10-1 = 9$.

For 9 d. f., we get from the table, $t_{0.05} = 2.26$.

Since $1.89 < 2.26$, our value 1.89 is not significant and the data do not provide any significant evidence against the hypothesis that the population mean is 66 inches.

By interpolation, for $t = 1.89$, $P = .9542$.

Hence $P[\,|t| > 1.89\,] = 2\,(1-.9542) = .0916$, i.e. greater than .05, and so the value of $t$ is not significant, meaning thereby that the hypothesis is not to be rejected.
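The arithmetic of this example can be reproduced in a few lines. The sketch below assumes the heights 63, 63, 66, 67, 68, 69, 70, 70, 71, 71, which are the values implied by the deviation table (sum of deviations from 68 equal to -2, sum of squares 82); the function name `one_sample_t` is an illustrative choice.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for testing a hypothetical population mean mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)          # unbiased estimate, divisor n - 1
    return (mean - mu0) / (s / math.sqrt(n)), n - 1

heights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]
t0, df = one_sample_t(heights, 66)
print(round(t0, 2), df)
```

The resulting $t_0$ is then compared with the tabulated 5% value for 9 d. f., exactly as in the worked solution.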

13.51. Fiducial (confidence) limits for the population mean.

Suppose a sample of size $n$ from a normal population has mean $\bar{x}$ and has the unbiased estimate $s^2$ of the population variance based on $\nu$ d. f. Then the 95% confidence range for $\mu$ is given by

$$\frac{|\bar{x}-\mu|\sqrt{n}}{s} < t_1,$$

where the value $t_1$ corresponds to $P = 0.05$.

Similarly we have confidence ranges and limits corresponding to other levels of significance.

Ex. A random sample of nine from the men of a large city gave a mean height of 68 in.; and the unbiased estimate $s^2$ of the population variance found from the sample was 4.5 in$^2$. Find the 95% and 98% fiducial limits for $\mu$.

Here $\nu = 8$, $t_1 = 2.31$, $s = 2.12$, $n = 9$. Hence

$$\frac{s\,t_1}{\sqrt{n}} = \frac{(2.12)(2.31)}{3} = 1.63.$$

The required 95% limits are

$$\bar{x} \pm \frac{s\,t_1}{\sqrt{n}} = 68 \pm 1.63,$$

that is 66.37 and 69.63 in.

From the t-table, $t_2 = 2.90$ is the value of $t$ which is exceeded numerically with a probability of 2%. Then the 98% limits are

$$\bar{x} \pm \frac{s\,t_2}{\sqrt{n}} = 68 \pm 2.05,$$

that is 65.95 and 70.05 in.
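The limits in this example follow from one formula, $\bar{x} \pm t\,s/\sqrt{n}$. A minimal sketch, with the tabulated $t$ values 2.31 and 2.90 supplied by hand and an illustrative function name:

```python
import math

def mean_confidence_limits(mean, s2, n, t_value):
    """Limits mean -/+ t * s / sqrt(n), with s2 the unbiased variance estimate."""
    half_width = t_value * math.sqrt(s2 / n)
    return mean - half_width, mean + half_width

lo95, hi95 = mean_confidence_limits(68.0, 4.5, 9, 2.31)   # t for P = 0.05, 8 d. f.
lo98, hi98 = mean_confidence_limits(68.0, 4.5, 9, 2.90)   # t for P = 0.02, 8 d. f.
print(round(lo95, 2), round(hi95, 2))
print(round(lo98, 2), round(hi98, 2))
```

Passing the tabulated value as an argument keeps the sketch independent of any particular table or library routine.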

13.6. Two Samples : Significance of the difference of the means.

We ask: does the difference between the two observed means indicate any real difference between the samples? What is the probability that the observed difference may be explained by errors of random sampling? If this probability is small, we will be compelled to postulate an inherent difference in the populations from which the samples were derived.

Let the samples be

$$x_1, x_2, \dots, x_{n_1} \quad \text{and} \quad x'_1, x'_2, \dots, x'_{n_2}$$

of sizes $n_1$ and $n_2$.

We have to see whether the two samples may be regarded as drawn from the same normal population or from different normal populations.

On the hypothesis that the samples are from the same population with S. D. $\sigma$, we have

$$\bar{x}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} x_i, \qquad \bar{x}_2 = \frac{1}{n_2}\sum_{j=1}^{n_2} x'_j.$$

S. D. of $\bar{x}_1$ is equal to $\sigma/\sqrt{n_1}$,

S. D. of $\bar{x}_2$ is equal to $\sigma/\sqrt{n_2}$,

S. D. for the difference $\bar{x}_1-\bar{x}_2 = \sqrt{\frac{\sigma^2}{n_1}+\frac{\sigma^2}{n_2}} = \sigma\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$.

But $\sigma$ is not known and we form an estimate of it from the data. Since we have supposed both sets of observations derived from the same population, we may regard the total $n_1+n_2$ observations as one sample and form an unbiased estimate $s$ of $\sigma$ from this combined set. In fact, we take

$$s^2 = \frac{\sum (x_i-\bar{x}_1)^2 + \sum (x'_j-\bar{x}_2)^2}{n_1+n_2-2}.$$

Then the S. D. of $\bar{x}_1-\bar{x}_2 = s\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$

and we want the probability

$$P(\,|t| > t_0\,) \quad \text{where} \quad t_0 = \frac{\bar{x}_1-\bar{x}_2}{s\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}, \quad \text{with d. f. } \nu = n_1+n_2-2.$$

If the probability $P(\,|t| > t_0\,)$ is greater than 5%, we conclude that such a value of $t$ may be due to fluctuations of random sampling and we need not necessarily suppose any real difference in the populations from which the samples were obtained.
Note 1. For large samples, the distribution of the statistic

$$t = \frac{\bar{x}_1-\bar{x}_2}{s\sqrt{(1/n_1)+(1/n_2)}}$$

is nearly normal and $P[\,|t| > 2\,] = 0.05$ approx.

Note 2. We may test the hypothesis that the samples were drawn from different normal populations, with means $\mu_1$ and $\mu_2$ respectively, but the same variance. In this case

$$\bar{x}_1-\mu_1 \sim N(0, \sigma^2/n_1), \qquad \bar{x}_2-\mu_2 \sim N(0, \sigma^2/n_2)$$

so that

$$(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2) \sim N\left(0, \; \sigma^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)\right).$$

Hence

$$t = \frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{s\sqrt{(1/n_1)+(1/n_2)}}$$

conforms to the $t$ distribution for $n_1+n_2-2$ D. F.

Note 3. We have

$$t^2 = \frac{(\bar{x}_1-\bar{x}_2)^2}{s^2\{(1/n_1)+(1/n_2)\}} = \frac{(\bar{x}_1-\bar{x}_2)^2 / [\sigma^2\{(1/n_1)+(1/n_2)\}]}{\left[\sum (x_i-\bar{x}_1)^2 + \sum (x'_j-\bar{x}_2)^2\right] / (\nu\sigma^2)} = \frac{\chi^2_1}{\chi^2_\nu/\nu}, \quad \nu = n_1+n_2-2$$

so that $t$ conforms to the $t$ distribution for $\nu$ D. F.

Thus we have the theorem :

A statistic $t$ conforms to the $t$ distribution for $\nu$ D. F. if $-\infty < t < \infty$, and

$$\frac{t^2}{\nu} = \frac{\chi_1^2}{\chi_2^2},$$

where $\chi_1^2$ and $\chi_2^2$ are independent variates, which are distributed like $\chi^2$ with 1 and $\nu$ D. F. respectively.

Ex. For a random sample of 10 pigs fed on diet A, the increases in weight in pounds in a certain period were

10, 6, 16, 17, 13, 12, 8, 14, 15, 9 lbs.

For another random sample of 12 pigs, fed on diet B, the increases in the same period were

7, 13, 22, 15, 12, 14, 18, 8, 21, 23, 10, 17 lbs.

Test whether diets A and B differ significantly as regards the effect on increases in weight (or test whether the mean increases in the two samples are significantly different). You may use the fact that the 5% value of $t$ for 20 d. f. is 2.09.

[Agra '63, B. A. Hon's Delhi '71, I. A. S. 1950]

From the data we have

$\bar{x}_1 = 12$, $\bar{x}_2 = 15$, $n_1 S_1^2 = 120$, $n_2 S_2^2 = 314$, $n_1 = 10$, $n_2 = 12$.

The estimate $s^2$ of the population variance is

$$s^2 = \frac{120+314}{10+12-2} = \frac{434}{20} = 21.7 \Rightarrow s = \sqrt{21.7} = 4.66.$$

The value of $t$ from the sample is then

$$|t| = \frac{15-12}{4.66\sqrt{\frac{1}{10}+\frac{1}{12}}} = 1.50, \quad \text{d. f.} = 20.$$

Since this is less than 2.09, the difference between the two sample means is not significant at the 5% level.
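The pooled two-sample statistic for the pig-diet data can be recomputed as a check. This is a sketch under the section's assumption of a common population variance; the function name `pooled_t` is illustrative.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Two-sample t with a pooled variance estimate on n1 + n2 - 2 degrees of freedom."""
    n1, n2 = len(sample_a), len(sample_b)
    m1, m2 = statistics.fmean(sample_a), statistics.fmean(sample_b)
    ss1 = sum((x - m1) ** 2 for x in sample_a)   # sum of squared deviations, sample A
    ss2 = sum((x - m2) ** 2 for x in sample_b)   # sum of squared deviations, sample B
    df = n1 + n2 - 2
    s = math.sqrt((ss1 + ss2) / df)
    return (m1 - m2) / (s * math.sqrt(1 / n1 + 1 / n2)), df

diet_a = [10, 6, 16, 17, 13, 12, 8, 14, 15, 9]
diet_b = [7, 13, 22, 15, 12, 14, 18, 8, 21, 23, 10, 17]
t0, df = pooled_t(diet_a, diet_b)
print(round(abs(t0), 2), df)
```

The sign of $t_0$ only reflects which sample is listed first; the comparison with the tabulated value uses $|t_0|$.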

exercises 

8 amplc Jf g d ta t ^g tUdCS 1 f ° r the thc following variates in a 

2 A n+rt • 2 * 2, 3, 3. [Agra ’48/ 

2. A certain stimulus administered to each of 12 patients resulted in the following increases of blood pressure :

5, 2, 8, -1, 3, 0, 6, -2, 1, 5, 0, 4

Can it be concluded that the stimulus will be, in general, accompanied by an increase in blood pressure, given that

$t_{.05} = 2.201$ for 11 d. f.

[M. Sc. Delhi '57]



3. The nine items of a sample had the following values :

45, 47, 50, 52, 48, 47, 49, 53, 51.

Does the sample mean differ significantly from the assumed population mean of 47.5? Given that

$P = .945$ for $t = 1.8$; $P = .953$ for $t = 1.9$; d. f. $= 8$

[M. Sc. Agra '58]

4. A machinist is making engine parts with axle diameters of 0.700 inch. A random sample of 10 parts shows a mean diameter of 0.742 inch with a S. D. of 0.040 inch. Compute the statistic you would use to test whether the work is meeting the specification. Also state how you would proceed further.

[B. A. Delhi '68]

5. A group of 10 children were tested to find out how many digits they could repeat from memory after hearing them once. They were given practice at this test during the next week and were then retested. Is the difference between the performance of the ten children at the two tests significant ?

Roll No. :  1   2   3   4   5   6   7   8   9  10
Test 1   :  6   5   4   7   8   6   7   5   6   8
Test 2   :  7   7   6   7   9   6   8   6   6  10

[Hint. $\sum d = 10$, $\sum d^2 = 16$, $\bar{d} = 1$, $s^2 = \frac{1}{n-1}\sum (d-\bar{d})^2 = \frac{6}{9} = 0.667$

so that $s = 0.8165$, $t = \frac{\bar{d}\sqrt{n}}{s} = \frac{\sqrt{10}}{0.8165} = 3.87$.

For $\nu = 9$, $t_{(0.01)} = 3.25$]
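The hint treats the two tests as paired observations and applies the one-sample $t$ to the differences. A small sketch of that calculation, assuming the scores 6, 5, 4, 7, 8, 6, 7, 5, 6, 8 and 7, 7, 6, 7, 9, 6, 8, 6, 6, 10 as read from the table; the function name `paired_t` is illustrative.

```python
import math
import statistics

def paired_t(first, second):
    """t statistic for the mean of paired differences (second - first), with n - 1 d. f."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.fmean(diffs)
    s = statistics.stdev(diffs)           # divisor n - 1
    return d_bar * math.sqrt(n) / s, n - 1

test1 = [6, 5, 4, 7, 8, 6, 7, 5, 6, 8]
test2 = [7, 7, 6, 7, 9, 6, 8, 6, 6, 10]
t0, df = paired_t(test1, test2)
print(round(t0, 2), df)
```

Pairing is appropriate here because the same ten children sat both tests, so the two columns are not independent samples.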

6. Ten students were examined without tuition and obtained certain marks. They were then given coaching for a full month and examined again, obtaining a second set of marks. We have to find whether the coaching has done any good or not.

Students :  A1  A2  A3  A4  A5  A6  A7  A8  A9  A10
x        :  20  18  19  22  17  20  19  16  21  19
y        :  22  19  17  18  21  23  19  20  22  20

Given R[| t |>1*25] = *2?6 for v=9 d. f. 

7. A random sample of 10 boys had the following I. Q.'s :

70, 120, 110, 101, 88, 83, 95, 98, 107, 100.

Do these support the assumption of a population mean I. Q. of 100 ? Find a reasonable range in which most of the mean I. Q. values of samples of 10 boys lie. [B. Sc. Madras '62]

[Hint. $t_{0.05} = 2.262$ for 9 d. f.

$$\bar{x} \pm t_{0.05}\,\frac{s}{\sqrt{n}} = 97.2 \pm 2.262 \times 4.514$$

i. e. 107.41 and 86.99]

8. A random sample of 16 values from a normal population showed a mean of 41.5 inches and a sum of squares of deviations from this mean equal to 135 square inches. Show that the assumption of a mean of 43.5 inches for the population is not reasonable. Obtain 95 and 99 per cent fiducial limits for the same.

You may use the following :

$\nu = 15$; $P = 0.05$, $t = 2.131$; $P = 0.01$, $t = 2.947$ [B. Sc. Bombay '68]

9. The means of two random samples of sizes 9 and 7 are 196.40 and 198.82 respectively. The sums of the squares of the deviations from the means are 26.94 and 18.74 respectively. Can the samples be considered to have been drawn from the same normal population ? You may use that

$t_{.05} = 2.145$ for $\nu = 14$

$t_{.01} = 2.977$ for $\nu = 14$

10. Samples of two types of electric light bulbs were tested for length of life and the following data were obtained :

                 Type I                  Type II
Sample No.       $n_1 = 8$               $n_2 = 7$
Sample means     $\bar{x}_1 = 1,234$ hrs.   $\bar{x}_2 = 1,036$ hrs.
Sample S. D.     $s_1 = 36$ hrs.          $s_2 = 40$ hrs.

Is the difference in the means sufficient to warrant that type I is superior to type II regarding length of life ? [B. Sc. Agra '63]

[Hint. $t_{0.05} = 2.160$ for $\nu = 13$]

11. In a certain experiment to compare two types of pig foods A and B, the following results of increase in weights were observed in pigs :

Food A :  49  53  51  52  47  50  52  53
Food B :  52  55  52  53  50  54  54  53

(a) Assuming that the two samples of pigs are independent, can we conclude that food B is better than food A ?

(b) Also examine the case when the same set of eight pigs were used in both the foods. [B. Sc. Hon's Delhi '55]

[Hint. (a) $t = 2.17$, $t_{0.05} = 2.145$ for $\nu = 14$

(b) $t = 4.32$, $t_{0.05} = 2.365$ for $\nu = 7$]


12. Eight pots growing three wheat plants each were exposed to a high tension discharge while similar pots were enclosed in an earthed wire cage. The number of tillers in each pot were as follows :

Caged       :  17  26  18  25  27  28  26  23  17
Electrified :  16  16  22  16  21  18  15  20

Discuss whether electrification exercises any real effect on tillering. [I. C. A. R. '56]

13. A coin is thrown 12 times and the number of heads noted. The experiment is made 12 times. The numbers of heads thrown are 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 8, 10. Is it likely that the coin is biased ?

[use $t_{.05} = 1.80$ for $\nu = 11$]

14. The means of two random samples of sizes 9 and 7 are 196.40 and 198.82 respectively. The sums of the squares of the deviations from the means are 26.94 and 18.73 respectively. Can the samples be considered to have been drawn from the same normal population, it being given that the value of $t$ for 14 d. f. at 5% level of significance is 2.145 and at 1% level of significance is 2.977 ? [B. Sc. Lucknow '60]

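15. Obtain tests of significance for testing the following hypotheses :

(a) $\sigma_0$ is the hypothetical value of $\sigma$ in a normal population $N(\mu, \sigma^2)$ with known $\mu$.

$\left(\sum (x_i-\mu)^2/\sigma_0^2 \sim \chi^2_n\right)$

(b) $\sigma_0$ is the hypothetical value of $\sigma$ in a normal population $N(\mu, \sigma^2)$ with unknown $\mu$.

$\left(\sum (x_i-\bar{x})^2/\sigma_0^2 \sim \chi^2_{n-1}\right)$

(c) $\mu_1 = \mu_2$ in two normal populations $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$ where $\mu_1$ and $\mu_2$ are unknown but $\sigma_1^2$ and $\sigma_2^2$ are known.

$\left(\dfrac{\bar{x}_1-\bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1)\right)$

[M. A. Patna '57, B. A. Hon's Delhi '58]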



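Consider a random sample of $n$ pairs of values from an uncorrelated bivariate normal population :

$(x_t, y_t)$, $t = 1, 2, \dots, n$.

Sample variances of $x$ and $y$ are given by

$$nS_1^2 = \sum_t (x_t-\bar{x})^2, \qquad nS_2^2 = \sum_t (y_t-\bar{y})^2.$$

Correlation in the sample is

$$r = \frac{\sum_t (x_t-\bar{x})(y_t-\bar{y})}{n S_1 S_2}.$$

The $n$ values $y_t$ are independent, and they are subjected to an orthogonal transformation yielding the variates $\eta_1, \eta_2, \dots, \eta_n$.

We take

$$\eta_1 = \frac{1}{\sqrt{n}}\sum_t y_t = \sqrt{n}\,\bar{y}$$

since the sum of the squares of the coefficients of $y_t$ is unity.

$$\sum_{t=1}^{n} \eta_t^2 = \sum_{t=1}^{n} y_t^2 = \sum_{t=1}^{n} (y_t-\bar{y})^2 + n\bar{y}^2 = nS_2^2 + \eta_1^2$$

$$\Rightarrow nS_2^2 = \sum_{t=2}^{n} \eta_t^2 \Rightarrow \frac{nS_2^2}{\sigma_2^2} = \chi^2_{n-1} \text{ variate.}$$

Further, from the definition of $r$,

$$\sqrt{n}\, r S_2 = \frac{\sum_t (x_t-\bar{x})(y_t-\bar{y})}{\sqrt{n}\, S_1} = \sum_t \frac{(x_t-\bar{x})}{\sqrt{n}\, S_1}\, y_t, \quad \text{since } \bar{y}\sum_t (x_t-\bar{x}) = 0$$

$$= \eta_2$$

(for $\sum_t \frac{(x_t-\bar{x})^2}{n S_1^2} = 1$, that is, the sum of the squares of the coefficients of $y_t$ is unity).

Now

$$nS_2^2 = \sum_{t=2}^{n} \eta_t^2 = \eta_2^2 + \sum_{t=3}^{n} \eta_t^2 \Rightarrow \sum_{t=3}^{n} \eta_t^2 = nS_2^2\,(1-r^2).$$

Here $\eta_t/\sigma_2$ are independent standard normal variates.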


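Hence

$$\frac{n r^2 S_2^2}{\sigma_2^2} = \chi_1^2 \text{ variate} \quad \text{and} \quad \frac{n S_2^2 (1-r^2)}{\sigma_2^2} = \chi^2_{n-2} \text{ variate.}$$

Thus

$$r^2 = \frac{n r^2 S_2^2/\sigma_2^2}{n S_2^2/\sigma_2^2} = \frac{n r^2 S_2^2/\sigma_2^2}{\dfrac{n r^2 S_2^2}{\sigma_2^2} + \dfrac{n S_2^2 (1-r^2)}{\sigma_2^2}} = \frac{\chi_1^2}{\chi_1^2 + \chi^2_{n-2}}$$

or

$$r^2 = \beta_1\left\{\tfrac{1}{2}, \tfrac{1}{2}(n-2)\right\} \text{ variate} \quad [\text{See Ex. 1 of § 8.13}]$$

Hence

$$dp = \frac{1}{B\left\{\frac{1}{2}, \frac{1}{2}(n-2)\right\}}\, (r^2)^{-1/2} (1-r^2)^{(n-4)/2}\, d(r^2), \quad 0 < r^2 < 1,$$

or

$$dp = \frac{1}{B\left\{\frac{1}{2}, \frac{1}{2}(n-2)\right\}}\, (1-r^2)^{(n-4)/2}\, dr, \quad -1 < r < 1.$$

Now, putting $r^2 = t$,

$$E(r^2) = \frac{1}{B\left\{\frac{1}{2}, \frac{1}{2}(n-2)\right\}} \int_0^1 t^{1/2} (1-t)^{(n-4)/2}\, dt = \frac{B\left\{\frac{3}{2}, \frac{1}{2}(n-2)\right\}}{B\left\{\frac{1}{2}, \frac{1}{2}(n-2)\right\}} = \frac{1}{n-1}$$

$\left[\text{recall } \dfrac{B(l+1, m)}{B(l, m)} = \dfrac{l}{l+m}\right]$

Also

$$E(r) = \int_{-1}^{1} \frac{r\,(1-r^2)^{(n-4)/2}}{B\left\{\frac{1}{2}, \frac{1}{2}(n-2)\right\}}\, dr = 0.$$

Hence

$$V(r) = E(r^2) - E^2(r) = \frac{1}{n-1}, \qquad \sigma(r) = \frac{1}{\sqrt{n-1}}. \text{ This is the S. E. of } r.$$

Now

$$r^2 \sim \beta_1\left[\tfrac{1}{2}, \tfrac{1}{2}(n-2)\right] \Rightarrow \frac{r^2}{1-r^2} \sim \beta_2\left[\tfrac{1}{2}, \tfrac{1}{2}(n-2)\right].$$

[Recall

$$dp = \frac{u^{l-1}(1-u)^{m-1}\, du}{B(l, m)}, \quad 0 < u < 1$$

is transformed into

$$dp = \frac{w^{l-1}\, dw}{B(l, m)\,(1+w)^{l+m}}, \quad 0 < w < \infty$$

by the substitution $w = \dfrac{u}{1-u}$; that is, if $u$ is a $\beta_1(l, m)$ variate then $\dfrac{u}{1-u}$ is a $\beta_2(l, m)$ variate.]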



We define the statistic $t$ by

$$\frac{t^2}{n-2} = \frac{r^2}{1-r^2}, \qquad t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}.$$

Testing the significance of an observed correlation r in the sample by t test.

Suppose $r$ is the correlation coefficient obtained by a random sample of $n$ pairs of values from a bivariate normal population. We can use the fact that

$$\frac{r}{1/\sqrt{n-1}} \sim N(0, 1)$$

for large $n$, but for small $n$ it is not normally distributed.

We have seen that $r^2$ in random samples of $n$ pairs from an uncorrelated bivariate normal population is a $\beta_1\left(\frac{1}{2}, \frac{1}{2}(n-2)\right)$ variate. This implies that $\dfrac{r^2}{1-r^2}$ is a $\beta_2\left(\frac{1}{2}, \frac{1}{2}(n-2)\right)$ variate.

Hence we define the statistic $t$ as

$$t = \frac{r}{\sqrt{1-r^2}}\,\sqrt{n-2}.$$

Observe that $\dfrac{t^2}{n-2} = \dfrac{r^2}{1-r^2}$ is a $\beta_2\left(\frac{1}{2}, \frac{1}{2}(n-2)\right)$ variate.

Hence the statistic $t$ conforms to the $t$ distribution for $n-2$ d. f.

For testing the significance of $r$ we assume that the variables in the bivariate normal population from which the $n$ pairs of values are taken are uncorrelated. We calculate $t$ from the sample and decide from the $t$ table whether the value obtained is a rare one. If it is, then the assumption does not hold good and we conclude that probably the variables in the population are correlated.
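The test statistic for an observed correlation is a one-line computation. The sketch below assumes $t = r\sqrt{n-2}/\sqrt{1-r^2}$ on $n-2$ d. f. and applies it to an illustrative case of $r = 0.45$ from 12 pairs; the function name is hypothetical.

```python
import math

def correlation_t(r, n):
    """t statistic and degrees of freedom for testing zero population correlation."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r), n - 2

t0, df = correlation_t(0.45, 12)
print(round(t0, 2), df)
```

The value of $t_0$ is then referred to the $t$ table with $n-2$ d. f. in the usual way.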

Exercises 

Ex. 1. A random sample of 12 pairs of observations from a normal population gives the coefficient of correlation of 0.45. Is this significant ?

[Hint. $t = \dfrac{0.45\sqrt{10}}{\sqrt{1-(0.45)^2}} = 1.59$, d. f. $= 10$.

For 10 d. f., $t_{0.05} = 2.23$; $r$ is not significant at 5% level of significance]

Ex. 2. $x_1, x_2, \dots, x_n$ is a sample from a normal population. $\bar{x}$ and $S^2$ are the sample mean and the sum of the squares of the deviations from the mean respectively. If $x'$ is one more observation independent of $x_1, x_2, \dots, x_n$, prove that

$$\frac{x'-\bar{x}}{\sqrt{S^2}}\,\sqrt{\frac{n(n-1)}{n+1}}$$

has the Student's t-distribution with $n-1$ d. f.

Ex. 3. $(x_i, y_i)$, $i = 1, 2, 3, \dots, n$ is a random sample from a bivariate normal population with correlation coefficient zero. Derive the sampling distribution of the sample correlation coefficient $r$; show that $\dfrac{r}{\sqrt{1-r^2}}\sqrt{n-2}$ is distributed as Student's $t$ and indicate how you would proceed to test the significance of the observed correlation $r$.

Ex. 4. $\bar{x}$ and $S^2$ are the mean and the variance of a random sample of size $n$ from a normal population with mean $\mu$ and variance $\sigma^2$. Show that

(i) $\bar{x}$ and $S^2$ are independent variables.

(ii) $\dfrac{nS^2}{\sigma^2}$ follows a $\chi^2$ distribution with $(n-1)$ d. f. Derive its mean and variance.

Hence indicate how you would test the hypothesis

$H_0 : \sigma^2 = \sigma_0^2$.

Ex. 5. $x$ and $y$ are two independent variables; $x$ is a $\chi^2$ variate with $n$ d. f., $y$ is a standard normal variate.

Find the p. d. f. of $t = \dfrac{y}{\sqrt{x/n}}$. [M. Sc. Agra '62]

Find $V(t)$ and obtain the approximate distribution of $t$ for large $n$.

Ex. 6. Let $(x_1, \dots, x_m)$ and $(y_1, \dots, y_n)$ be independent random samples from a normal population with mean zero and variance $\sigma^2$. Let their means be $\bar{x}$ and $\bar{y}$ and their variances $S_x^2$ and $S_y^2$. Let the pooled variance $S_p^2$ be defined by

$$S_p^2 = \frac{(m-1)S_x^2 + (n-1)S_y^2}{m+n-2}.$$

Prove that $\bar{x}-\bar{y}$ and $(m+n-2)\dfrac{S_p^2}{\sigma^2}$ are independently distributed, the former as a normal variate with zero mean and $\sigma^2\left(\dfrac{1}{m}+\dfrac{1}{n}\right)$ variance and the latter as a $\chi^2$ variate with $m+n-2$ d. f.

Ex. 7. Obtain the sampling distribution of Student's $t$ in random samples of size $n$ from a normal population. Work out the test procedure for testing the hypothesis $\mu = c$, where $c$ is some specified constant.




Ex. 8. Derive the distribution of the ratio $u/v$ where $u$ is a standard normal variate and $v$ $(> 0)$ is such that $v^2$ is distributed as $\chi^2$ with $m$ d. f. ($u$ and $v$ being independent).

Show how this ratio is useful in testing the significance of the 
difference between two sample means under certain conditions to 
be carefully stated. 


Ex 9 What is Student’s /-statistic ? How does it arise in 
Statistics ? Find its sampling distribution and indicate its use. 

Ex. 10. If Xi, x„ is a random sample from a normal 
population N (0, 1); obtain the distributions of 

n n V(n— Hr 

0) S X/ s (ii) £(x,-x)°- mi) ~ - 


1 


u 


J/2 


Ex. 11. $x_1, x_2, \dots, x_n$ are $n$ independent sample values from a normal population with mean $\mu$ and S. D. $= \sigma$. Derive the probability density function of the sample variance $s^2$ where

$$(n-1)\,s^2 = \sum_{i=1}^{n} (x_i-\bar{x})^2.$$

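Ex. 12. Derive the distribution of r and show that when n = 5 and if $P(|r| \ge c) = \alpha$, c is a root of the equation

\[ c\sqrt{1-c^2} + \sin^{-1} c + \frac{\pi}{2}(\alpha - 1) = 0. \]

(B.A. Hon's Calcutta)

[Hint.
\[ dp = \frac{1}{B\left(\frac12, \frac{n-2}{2}\right)}\,(1-r^2)^{(n-4)/2}\,dr, \]

\[ P(|r| \ge c) = 1 - P(|r| < c) = 1 - P(-c < r < c) = 1 - 2P(0 < r < c), \]

since the distribution of r is symmetrical about r = 0,

\[ = 1 - 2\int_0^c f(r)\,dr. \]

With n = 5, $B\left(\frac12, \frac32\right) = \dfrac{\pi}{2}$, so that

\[ P(|r| \ge c) = 1 - \frac{4}{\pi}\int_0^c (1-r^2)^{1/2}\,dr = 1 - \frac{4}{\pi}\cdot\frac12\left[c\,(1-c^2)^{1/2} + \sin^{-1} c\right] = \alpha \]

\[ \Rightarrow\ 1 - \frac{2}{\pi}\left[c\,(1-c^2)^{1/2} + \sin^{-1} c\right] = \alpha \]

\[ \Rightarrow\ c\,(1-c^2)^{1/2} + \sin^{-1} c + (\alpha - 1)\,\frac{\pi}{2} = 0.\ ] \]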
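The root c of the equation in Ex. 12 has no closed form, so a numerical check may help. The following is a minimal sketch, not part of the original text: the function names `tail_prob_n5` and `critical_r_n5` are illustrative, and bisection is simply one convenient way of locating the root, assuming only the closed form for $P(|r| \ge c)$ obtained in the hint for n = 5.

```python
import math


def tail_prob_n5(c):
    """P(|r| >= c) for n = 5, rho = 0, using the closed form from the hint."""
    return 1.0 - (2.0 / math.pi) * (c * math.sqrt(1.0 - c * c) + math.asin(c))


def critical_r_n5(alpha, tol=1e-12):
    """Root of c*sqrt(1-c^2) + asin(c) + (pi/2)*(alpha-1) = 0 on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # the tail probability falls steadily from 1 (c = 0) to 0 (c = 1)
        if tail_prob_n5(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


c = critical_r_n5(0.05)
print(round(c, 3), round(tail_prob_n5(c), 4))
```

For $\alpha = 0.05$ the root lies close to 0.878, the value usually tabulated for the 5% point of r with 3 degrees of freedom.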

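13.8. Distribution of regression coefficients.

The line of regression of y on x is $y = bx$, where

\[ b = r\,\frac{S_2}{S_1}, \]

$S_2$ and $S_1$ being the sample S.D.'s of y and x. Then

\[ \frac{b^2\sigma_1^2}{\sigma_2^2} = \frac{n r^2 S_2^2/\sigma_2^2}{n S_1^2/\sigma_1^2} = \frac{\chi_1^2}{\chi_{n-1}^2} = \beta_2\left[\tfrac12,\ \tfrac12(n-1)\right]\ \text{variate}, \]

where $\sigma_1$ and $\sigma_2$ are the standard deviations of the variables x and y in the uncorrelated parent bivariate normal population.

Since $E[\beta_2(l, m)\ \text{variate}] = \dfrac{l}{m-1}$,

\[ E(b^2) = \frac{\sigma_2^2}{(n-3)\,\sigma_1^2}. \]

The distribution of $b^2\sigma_1^2/\sigma_2^2$ is

\[ dP = \frac{1}{B\left[\tfrac12,\ \tfrac12(n-1)\right]}\ \frac{\left(b^2\sigma_1^2/\sigma_2^2\right)^{-1/2}}{\left(1 + b^2\sigma_1^2/\sigma_2^2\right)^{n/2}}\ d\!\left(\frac{b^2\sigma_1^2}{\sigma_2^2}\right), \qquad 0 < b^2 < \infty \]

\[ \Rightarrow\ dP = \frac{\sigma_1\,\sigma_2^{\,n-1}\ db}{B\left[\tfrac12,\ \tfrac12(n-1)\right]\left(\sigma_2^2 + \sigma_1^2 b^2\right)^{n/2}}, \qquad -\infty < b < \infty. \]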

Testing the significance of an observed regression coefficient by t-test.

Suppose the sample yields the value b of the linear regression coefficient of y on x. We have seen above that

\[ \frac{b^2\sigma_1^2}{\sigma_2^2} \]

corresponds to a $\beta_2\left[\tfrac12,\ \tfrac12(n-1)\right]$ variate. Hence we define the statistic t by

\[ \frac{t^2}{n-1} = \frac{b^2\sigma_1^2}{\sigma_2^2}, \qquad\text{i.e.}\qquad t = \frac{b\,\sigma_1}{\sigma_2}\sqrt{n-1}. \]

Thus the statistic t conforms to the t distribution for (n - 1) D.F. But since $\sigma_1$ and $\sigma_2$ are not known, this formula is not of much use.




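Since $b = r\,\dfrac{S_2}{S_1}$,

\[ b^2 = \frac{r^2 S_2^2}{S_1^2} \quad\Rightarrow\quad b^2 S_1^2 = r^2 S_2^2, \]

so that

\[ \frac{b^2 \sum (x_i-\bar{x})^2}{\sum (y_i - Y_i)^2} = \frac{n r^2 S_2^2/\sigma_2^2}{\sum (y_i-Y_i)^2/\sigma_2^2} = \frac{\chi_1^2}{\chi_{n-2}^2} = \beta_2\left[\tfrac12,\ \tfrac{n-2}{2}\right]\ \text{variate}, \]

where $Y_i$ is the estimate of $y_i$ from the regression equation

\[ y = r\,\frac{S_2}{S_1}\,x. \]

Observe

\[ \sum (y_i - \bar{y})^2 = \sum (y_i - Y_i)^2 + \sum (Y_i - \bar{y})^2 \]

\[ \Rightarrow\ n S_2^2 = n(1-r^2)\,S_2^2 + n r^2 S_2^2, \]

since $\sum (y_i - Y_i)^2 = n S_2^2 (1-r^2)$,

\[ \Rightarrow\ \frac{n S_2^2}{\sigma_2^2} = \frac{n(1-r^2)\,S_2^2}{\sigma_2^2} + \frac{n r^2 S_2^2}{\sigma_2^2} \]

\[ \Rightarrow\ \chi_{n-1}^2 = \chi_{n-2}^2 + \chi_1^2, \]

because of the additive property of $\chi^2$.

The statistic

\[ t = b\,\sqrt{\frac{(n-2)\sum (x_i - \bar{x})^2}{\sum (y_i - Y_i)^2}} \qquad\left[\ \frac{t^2}{n-2} = \frac{b^2 \sum (x_i-\bar{x})^2}{\sum (y_i-Y_i)^2} = \beta_2\left[\tfrac12,\ \tfrac{n-2}{2}\right]\ \text{variate}\ \right] \]

conforms to the t distribution for n - 2 D.F. and may be used to test the significance of the value of b found from the sample.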
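The statistic above can be computed directly from paired observations. The sketch below is illustrative only: the function name `slope_t_statistic` and the five data pairs are hypothetical, and deviations are taken from the sample means so that the fitted line agrees with $y = bx$ in deviation form.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (b, t, d.f.) for the regression of y on x, with t on n - 2 d.f."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    # residual sum of squares: sum of (y_i - Y_i)^2, Y_i being the fitted value
    resid_ss = sum((y - (y_bar + b * (x - x_bar))) ** 2 for x, y in zip(xs, ys))
    t = b * math.sqrt((n - 2) * sxx / resid_ss)
    return b, t, n - 2


# hypothetical data, for illustration only
b, t, df = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(b, round(t, 4), df)
```

The same value of t is obtained from $r\sqrt{n-2}/\sqrt{1-r^2}$, which gives a quick cross-check on hand computation.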

Note 1. The significance of an observed partial correlation coefficient can be judged by the statistic

\[ t = \frac{r\,\sqrt{n-k-2}}{\sqrt{1-r^2}}, \qquad \text{d.f.} = n-k-2, \]

where r is a partial correlation coefficient with k secondary subscripts.

Note 2. The significance of a rank correlation coefficient $r_s$ can be judged by the statistic

\[ t = r_s\left(\frac{n-2}{1-r_s^2}\right)^{1/2}, \qquad \text{d.f.} = n-2, \]

under the hypothesis that the rank correlation coefficient in the population is zero.





Ex. Define the regression coefficient of y on x for a random sample from a bivariate population. Derive the sampling distribution of the regression coefficient when the bivariate population is normal with correlation coefficient zero. How would you utilise this distribution to test the significance of the observed regression coefficient?

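13.9. Snedecor's F-distribution.

Distribution of the ratio of two independent estimates of the population variance.

Suppose we have two independent random samples with values $x_i$ $(i = 1, 2, \ldots, n_1)$ and $x'_j$ $(j = 1, 2, \ldots, n_2)$.

Let their means be $\bar{x}_1$ and $\bar{x}_2$. The unbiased estimates of the variances of the population are given by

\[ s_1^2 = \frac{\sum (x_i - \bar{x}_1)^2}{n_1 - 1}, \qquad s_2^2 = \frac{\sum (x'_j - \bar{x}_2)^2}{n_2 - 1}. \]

From these we have to decide whether the difference $|s_1^2 - s_2^2|$ is significant, or whether the two samples may be regarded to have been drawn from the same normal population with variance $\sigma^2$.

An appropriate test is furnished by the sampling distribution of $F = s_1^2/s_2^2$, where $s_1^2$ and $s_2^2$ are unbiased estimates of $\sigma^2$ obtained from independent samples from the same normal population. Thus

\[ F = \frac{\nu_2 \sum (x_i - \bar{x}_1)^2}{\nu_1 \sum (x'_j - \bar{x}_2)^2}, \qquad \nu_1 = n_1 - 1,\ \ \nu_2 = n_2 - 1, \]

\[ \Rightarrow\ \frac{\nu_1 F}{\nu_2} = \frac{\sum (x_i - \bar{x}_1)^2/\sigma^2}{\sum (x'_j - \bar{x}_2)^2/\sigma^2} = \frac{\chi_{\nu_1}^2}{\chi_{\nu_2}^2} = \beta_2\left(\tfrac12\nu_1,\ \tfrac12\nu_2\right)\ \text{variate}. \]

Thus

\[ dP = \frac{\left(\dfrac{\nu_1 F}{\nu_2}\right)^{\frac12\nu_1 - 1} d\!\left(\dfrac{\nu_1 F}{\nu_2}\right)}{B\left(\tfrac12\nu_1, \tfrac12\nu_2\right)\left(1 + \dfrac{\nu_1}{\nu_2}F\right)^{\frac12(\nu_1+\nu_2)}}, \qquad 0 < F < \infty \]

\[ \Rightarrow\ dP = \frac{\left(\dfrac{\nu_1}{\nu_2}\right)^{\frac12\nu_1} F^{\frac12(\nu_1-2)}\ dF}{B\left(\tfrac12\nu_1, \tfrac12\nu_2\right)\left(1 + \dfrac{\nu_1}{\nu_2}F\right)^{\frac12(\nu_1+\nu_2)}}, \qquad 0 < F < \infty \qquad \ldots(1) \]

[Note: $s_1 > s_2$; this condition must be satisfied as the F-test is a one-sided test.]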



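This is the required distribution of $s_1^2/s_2^2$ for $\nu_1$ and $\nu_2$ D.F. This distribution is independent of $\sigma^2$ of the population.

Since a $\beta_2(l, m)$ variate has the mode at $\dfrac{l-1}{m+1}$,

\[ \frac{\nu_1 F}{\nu_2} = \beta_2\left(\tfrac12\nu_1, \tfrac12\nu_2\right)\ \text{variate has the mode at}\ \frac{\nu_1 - 2}{\nu_2 + 2} \]

$\Rightarrow$ the modal value of F is given by

\[ F = \frac{\nu_2\,(\nu_1 - 2)}{\nu_1\,(\nu_2 + 2)} \qquad [\nu_1 > 2]. \]

Since $E[\beta_2(l, m)\ \text{variate}] = \dfrac{l}{m-1}$,

\[ E\left[\frac{\nu_1 F}{\nu_2}\right] = \frac{\tfrac12\nu_1}{\tfrac12\nu_2 - 1} = \frac{\nu_1}{\nu_2 - 2} \quad\Rightarrow\quad E(F) = \frac{\nu_2}{\nu_2 - 2}, \]

which is independent of $\nu_1$ and is always greater than unity.

The probability curve for F depends on both $\nu_1$ and $\nu_2$.

If we put $F = e^{2z}$, Fisher's z distribution is obtained. The substitution yields

\[ dP = \frac{2\,\nu_1^{\frac12\nu_1}\,\nu_2^{\frac12\nu_2}\ e^{\nu_1 z}\ dz}{B\left(\tfrac12\nu_1, \tfrac12\nu_2\right)\left(\nu_1 e^{2z} + \nu_2\right)^{\frac12(\nu_1+\nu_2)}}, \qquad -\infty < z < \infty. \]

The hypothesis under test is that $s_1^2$ and $s_2^2$ are both estimates of the same $\sigma^2$. A significantly large value of F throws doubt on the truth of the hypothesis.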
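The stated total probability, mean and mode of the F curve can be checked numerically. The sketch below is an illustration rather than part of the original text: the names `f_pdf` and `integrate_over_f`, the choice $\nu_1 = 6$, $\nu_2 = 10$, and the midpoint rule with the substitution $F = s/(1-s)$ are all assumptions made for the example.

```python
import math


def f_pdf(F, v1, v2):
    """Density (1) of Snedecor's F with v1 and v2 degrees of freedom."""
    if F <= 0.0:
        return 0.0
    log_beta = math.lgamma(v1 / 2) + math.lgamma(v2 / 2) - math.lgamma((v1 + v2) / 2)
    log_pdf = ((v1 / 2) * math.log(v1 / v2) + (v1 / 2 - 1) * math.log(F)
               - log_beta - ((v1 + v2) / 2) * math.log1p(v1 * F / v2))
    return math.exp(log_pdf)


def integrate_over_f(g, steps=20000):
    """Midpoint rule for the integral of g over (0, inf), using F = s / (1 - s)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        F = s / (1.0 - s)
        total += g(F) / (1.0 - s) ** 2 * h   # dF = ds / (1 - s)^2
    return total


v1, v2 = 6, 10
total = integrate_over_f(lambda F: f_pdf(F, v1, v2))
mean = integrate_over_f(lambda F: F * f_pdf(F, v1, v2))
mode = v2 * (v1 - 2) / (v1 * (v2 + 2))
print(round(total, 6), round(mean, 4), round(mode, 4))
```

With these degrees of freedom the mean should agree with $\nu_2/(\nu_2-2) = 1.25$, and the density should be largest near $F = \nu_2(\nu_1-2)/\{\nu_1(\nu_2+2)\}$.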

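Aliter. Direct Derivation of F-distribution.

Let $\chi_1^2$ and $\chi_2^2$ be independently distributed $\chi^2$-variables with $\nu_1$ and $\nu_2$ D.F. Their joint distribution is

\[ dP = \frac{1}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}\ e^{-\frac12(\chi_1^2+\chi_2^2)} \left(\tfrac12\chi_1^2\right)^{(\nu_1/2)-1} \left(\tfrac12\chi_2^2\right)^{(\nu_2/2)-1} d\!\left(\tfrac12\chi_1^2\right) d\!\left(\tfrac12\chi_2^2\right). \]

Introduce the new variables F and w by

\[ F = \frac{\chi_1^2/\nu_1}{\chi_2^2/\nu_2}, \qquad w = \tfrac12\chi_2^2, \qquad 0 < F < \infty,\ \ 0 < w < \infty, \]

so that

\[ \tfrac12\chi_1^2 = \frac{\nu_1}{\nu_2}\,F\,w, \qquad \tfrac12\chi_2^2 = w, \]

\[ d\!\left(\tfrac12\chi_1^2\right) d\!\left(\tfrac12\chi_2^2\right) = \left|\frac{\partial\left(\tfrac12\chi_1^2,\ \tfrac12\chi_2^2\right)}{\partial(F, w)}\right| dF\,dw = w\,\frac{\nu_1}{\nu_2}\ dF\,dw. \]

Thus

\[ dP = \frac{(\nu_1/\nu_2)^{\nu_1/2}}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}\ F^{(\nu_1/2)-1}\ w^{\frac{\nu_1+\nu_2}{2}-1}\ e^{-\left(\frac{\nu_1}{\nu_2}Fw + w\right)}\ dF\,dw. \]

Integrating out the extraneous variable w, we get the distribution of F,

\[ dP = \frac{(\nu_1/\nu_2)^{\nu_1/2}\ F^{(\nu_1/2)-1}\ dF}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)} \int_0^\infty w^{\frac{\nu_1+\nu_2}{2}-1}\ e^{-w\left(1+\frac{\nu_1}{\nu_2}F\right)}\ dw \]

\[ = \frac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} F^{(\nu_1/2)-1} \left(1+\frac{\nu_1}{\nu_2}F\right)^{-\frac{\nu_1+\nu_2}{2}} dF \]

or $dp = h_{\nu_1,\nu_2}(F)\ dF$.

Recall

\[ \Gamma(n) = \int_0^\infty e^{-x}\,x^{n-1}\,dx, \qquad B(m, n) = \int_0^1 x^{m-1}(1-x)^{n-1}\,dx = \frac{\Gamma(m)\,\Gamma(n)}{\Gamma(m+n)}. \]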


Values of F have been tabulated such that 

a 


( Fa /» (F) dF=a 

Jo Vj, v 2 




. H*,) 

- Pr 


Vi + V, 

“F* 


dF 


...( 2 ) 

We use this integral in evaluating the moments of the F 
distribution. 


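\[ \mu'_r = E(F^r) = \frac{(\nu_1/\nu_2)^{\nu_1/2}}{B(\nu_1/2,\ \nu_2/2)} \int_0^\infty \frac{F^{(\nu_1/2)-1+r}}{\left(1+\dfrac{\nu_1}{\nu_2}F\right)^{(\nu_1+\nu_2)/2}}\ dF \]

\[ = \frac{(\nu_1/\nu_2)^{\nu_1/2}}{B(\nu_1/2,\ \nu_2/2)} \left(\frac{\nu_2}{\nu_1}\right)^{(\nu_1/2)+r} B\left(\frac{\nu_1}{2}+r,\ \frac{\nu_2}{2}-r\right) \]

\[ = \left(\frac{\nu_2}{\nu_1}\right)^{r} \frac{\Gamma(\nu_1/2 + r)\ \Gamma(\nu_2/2 - r)}{\Gamma(\nu_1/2)\ \Gamma(\nu_2/2)}, \qquad r < \frac{\nu_2}{2} \qquad \ldots(3) \]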
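Result (3) lends itself to direct evaluation with log-gamma functions. The following sketch is only an illustration: the function name `f_raw_moment` is hypothetical, and the degrees of freedom 6 and 10 are chosen arbitrarily so that the first two moments exist.

```python
import math


def f_raw_moment(r, v1, v2):
    """r-th moment about the origin of F(v1, v2), by result (3); needs r < v2/2."""
    if r >= v2 / 2:
        raise ValueError("the moment exists only for r < v2/2")
    log_m = (r * math.log(v2 / v1)
             + math.lgamma(v1 / 2 + r) + math.lgamma(v2 / 2 - r)
             - math.lgamma(v1 / 2) - math.lgamma(v2 / 2))
    return math.exp(log_m)


v1, v2 = 6, 10
mean = f_raw_moment(1, v1, v2)
variance = f_raw_moment(2, v1, v2) - mean ** 2
print(round(mean, 4), round(variance, 4))
```

The mean reproduces $\nu_2/(\nu_2-2)$, and the variance agrees with the usual closed form $2\nu_2^2(\nu_1+\nu_2-2)/\{\nu_1(\nu_2-2)^2(\nu_2-4)\}$, which requires $\nu_2 > 4$.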

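Note 1. By a simple change of variable the $F(\nu_1, \nu_2)$ distribution is changed into a $\beta_1$ distribution.

Make the substitution in (1)

\[ x = \frac{\dfrac{\nu_1}{\nu_2}F}{1 + \dfrac{\nu_1}{\nu_2}F} \quad\Rightarrow\quad F = \frac{\nu_2}{\nu_1}\cdot\frac{x}{1-x}, \qquad dF = \frac{\nu_2}{\nu_1}\cdot\frac{dx}{(1-x)^2}. \]

Thus

\[ dP = \frac{1}{B\left(\frac{\nu_1}{2}, \frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} \left[\frac{\nu_2\,x}{\nu_1(1-x)}\right]^{\frac{\nu_1}{2}-1} (1-x)^{\frac{\nu_1+\nu_2}{2}}\ \frac{\nu_2}{\nu_1}\cdot\frac{dx}{(1-x)^2} \]

\[ = \frac{1}{B\left(\frac{\nu_1}{2}, \frac{\nu_2}{2}\right)}\ x^{\frac{\nu_1}{2}-1}\,(1-x)^{\frac{\nu_2}{2}-1}\ dx, \qquad 0 < x < 1. \]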


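Note. For the t-distribution with $\nu$ D.F., the probability differential is

\[ dp = \frac{dt}{\nu^{1/2}\ B\left(\tfrac12, \tfrac{\nu}{2}\right)\left(1 + t^2/\nu\right)^{\frac12(\nu+1)}}, \qquad -\infty < t < \infty \]

\[ = \frac{(t^2)^{-1/2}\left(1 + \dfrac{t^2}{\nu}\right)^{-\frac12(\nu+1)} d(t^2)}{\nu^{1/2}\ B\left(\tfrac12, \tfrac{\nu}{2}\right)}, \qquad 0 < t^2 < \infty \]

\[ = \frac{1}{B\left(\tfrac12, \tfrac{\nu}{2}\right)} \left(\frac{1}{\nu}\right)^{1/2} (t^2)^{\frac12 - 1} \left(1 + \frac{t^2}{\nu}\right)^{-\frac12(\nu+1)} d(t^2) \]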



Table 4. Variance ratio: 5% and 1% 'points' of F

$\nu_1$ is the number of degrees of freedom for the greater estimate of variance, and $\nu_2$ for the smaller.



18 51 
98 49 

19 00 
99 00 

19 16 
99 17 

10-13 
34 12 

9 55 
30 82 

9 28 
29 40 

7 71 

21 20 

C 94 
IS oo 

0 69 

1G G9 

G 61 
1G 20 

5 79 
13 27 

* 5 41 

12 0-5 

5 99 

13 74 

5 1 1 
10 92 

4 4 0 
9 7 3 

5 59 
12 25 

4-74 

9 65 

4-3 5 

8 4i. 

5 32 
11-20 

4 4G 

8 Go 

•1 07 
7 59 

5 12 
10 50 

4 26 

8 02 

3 80 
C 96 

4-90 

10 04 

4 10 

7-60 

3 71 

6 65 

4 75 
9 33 

3 88 

0 93 

3 49 
6 95 

4'60 
8 80 

3 74- 
6 51 

$34 

5 SO 

4-40 
8 63 

3 63 

0 23 

3 P4 
6-29 

4 41 

8 23 | 

3 *5 

G 01 

3 10 

5 09 

4 35 

3 10 

3 49 

5 85 

3 10 

4 ; 1 

4 24 

7-77 

3 38 

6 57 

2-90 

4 61 

4-17 

7-5C 

3 32 

5 39 

2-92 

4 51 


4-l)S 
7 31 

400 
7 08 

3 9G 
6-9G 


3 . 

*.s> 

6- IS 

3 15 

4-9S 

311 

sss 


2-84 
4 31 

2 76 
4 13 

2-72 

4 04 


19-26 

99-26 

0 12 
28 71 

6 39 
15 98 

5 if# 

U-C9 

4 53 
9 15 

4 12 

7 85 

S-34 
7 G1 

3-CS 
9 42 

3 43 

5 99 

3 26 

5 11 

3 11 

6 03 

3 01 

4 77 

2 93 
4 53 

2 87 
4 43 

2 76 
4 IS 

2 69 
4 02 

2-61 

3 83 

2-52 
3 65 

2 49 

3 56 


IP 30 T9 33 
09 30 99 33 

9 jl 8-94 
2fc 2 ♦ 27 91 

6 26 6 18 
1552 15-21 

5-06 4-95 

10 97 10 67 


4 39 
8 76 

3 07 
7-46 

3 69 
0-63 


5-SH 


e og 

3 i3 

5 6-4 

3 11 

6 06 

2 SO 

4 63 

2 83. 

4 M 

2 77 
4 25 

2 71 
4-10 

2-60 

3-86 

2- 53 

3- 70 

2 45 

3 51 

2-37 
3 31 

2 33 

3 25 


4-95 
10 67 

4-28 

1 8-47 

3-87 

7-19 

3 68 
6 37 

3-37 
5 80 

3 22 
5 39 

3 00 
< 62 

2-65 

4 40 

2 74 
4 20 

2 GG 
4 01 

2-60 

3 87 

2 49 

3 03 

2 42 

3 47 

2 34 

3 29 

2 25 

3 12 

2-21 

3-01 


19 37 
99 36 

8 84 
27 49 

0 04 
14 80 

4 82 
10 27 

4 1C 
8 10 

3-73 

G-84 

3 44 
6 03 

3-2S 
6 47 

3 07 
6 06 

2-85 

4 50 

2-70 

4 14 

2 59 

3 89 

2 51 

3 71 

2-45 
3 55 

2 34 

3 32 

2- 27 

3- 17 

2 16 

2-99 

2 10 
2 82 

2 06 
2 74 


10 41 

19 45 

99-42 

09-4G 

8 74 

8 64 

27 05 

26-60 

6 0) 

6 77 

14-37 

13 S3 

4-68 

4 53 

0 80 

0-4? 

4-00 

3 84 

7-72 

7-31 

3 57 

$ 41 

6-47 

6-07 

3-28 

3 12 

5 67 

5 28 

3-07 

2-00 

5 11 

4-73 

2-01 

2-74 

471 

4 33 

2 69 

2 60 

4-1C 

3-78 

2-'5 3 

2-35 

3 80 

3 43 

2 42 

2 24 


3*58 

2 34 

3 37 

oq 

4 - 

3 23 

2 10 
2 90 

2 09 
2-84 

2 00 
2 66 

1- 92 

2- 50 

1 88 

241 


3 18 

2 15 

3 01 

2 08 
2 8C 

1 96 

2 02 

1 -89 

247 

1- 79 

2- 29 

1 7Q 

2 12 

1 -65 

2-03 


19 50 
09 50 

8 63 
26 12 

6 63 
13 40 

4 36 
9-02 

3 67 
6 88 

3 23 

0 66 I 

2 93 I 

4 86 I 

2 71 

4 31 

2 64 
3-01 I 

2 30 
3-36 

2 131 

3 00 

2 01 
2 76 

1 02 

2 67 

1 84 I 

2 42 I 

1-71 
2 17 

1 62-f 

2 01 j 

1 51 I 
1 81 

I-S9 I 
1 60 

! 32 

l 49 


Explanation for the F-table.

The ratio, F, tabulated is that of the larger estimate of variance to the smaller. The number $\nu_1$ of d.f. corresponding to the larger estimate determines the column in the table, while $\nu_2$ determines the row. At the intersection of the row and the column are given two values of F. The upper is the value that will be exceeded with a probability 0.05, the lower with a probability 0.01. These are referred to as the 5% and 1% 'points' of F.
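Where a printed table is not at hand, the 5% and 1% points can be approximated numerically. The sketch below is an assumption-laden illustration, not the method by which the table was compiled: the names `f_upper_tail` and `f_upper_point` are hypothetical, the tail probability is obtained by a midpoint-rule integration of the $\beta_1$ form of the distribution, and the routine is intended only for $\nu_2 \ge 2$.

```python
import math


def f_upper_tail(c, v1, v2, steps=4000):
    """P(F > c): x = v1*F/(v2 + v1*F) is a beta_1(v1/2, v2/2) variate."""
    a, b = v1 / 2, v2 / 2
    x = v1 * c / (v2 + v1 * c)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    h = (1.0 - x) / steps
    total = 0.0
    for i in range(steps):
        s = x + (i + 0.5) * h          # midpoints of (x, 1)
        total += math.exp((a - 1) * math.log(s)
                          + (b - 1) * math.log1p(-s) - log_beta) * h
    return total


def f_upper_point(alpha, v1, v2):
    """Value of F exceeded with probability alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while f_upper_tail(hi, v1, v2) > alpha:
        hi *= 2.0                       # widen the bracket until it holds the root
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if f_upper_tail(mid, v1, v2) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(round(f_upper_point(0.05, 7, 9), 2))
```

For $\nu_1 = 7$, $\nu_2 = 9$ the result may be compared with the 5% point 3.29 quoted in Ex. 2 of section 13.10.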






\[ = h_{1,\nu}(t^2)\ d(t^2). \]

Thus the square of Student's t is simply distributed as F with 1 and $\nu$ D.F.
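The identity between the distribution of $t^2$ and $F(1, \nu)$ can be verified pointwise. In the sketch below the function names and the choice $\nu = 9$ are illustrative assumptions; the density of $t^2$ is obtained from that of t by the usual change of variable, both signs of t contributing.

```python
import math


def t_pdf(t, v):
    """Density of Student's t with v degrees of freedom."""
    log_beta = math.lgamma(0.5) + math.lgamma(v / 2) - math.lgamma((v + 1) / 2)
    return math.exp(-0.5 * math.log(v) - log_beta
                    - ((v + 1) / 2) * math.log1p(t * t / v))


def t_squared_pdf(u, v):
    """Density of u = t^2: t = +sqrt(u) and t = -sqrt(u) both map to u."""
    return t_pdf(math.sqrt(u), v) / math.sqrt(u)


def f_pdf(F, v1, v2):
    """Density (1) of F with v1 and v2 degrees of freedom, for F > 0."""
    log_beta = math.lgamma(v1 / 2) + math.lgamma(v2 / 2) - math.lgamma((v1 + v2) / 2)
    return math.exp((v1 / 2) * math.log(v1 / v2) + (v1 / 2 - 1) * math.log(F)
                    - log_beta - ((v1 + v2) / 2) * math.log1p(v1 * F / v2))


v = 9
for u in (0.3, 1.0, 4.0):
    print(u, t_squared_pdf(u, v), f_pdf(u, 1, v))
```

The two columns printed for each u should coincide up to rounding error.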

Note 2. If we make the substitution in (1)

\[ F' = \frac{1}{F}, \]

we obtain

\[ dp = \text{const.} \times (F')^{(\nu_2/2)-1}\left(1 + \frac{\nu_2}{\nu_1}F'\right)^{-\frac12(\nu_1+\nu_2)} dF', \qquad \infty > F' > 0 \]

\[ = h_{\nu_2,\nu_1}(F')\ dF' \qquad \ldots(4) \]

Cor. If $\nu_1 = \nu_2 = n-1$, the distribution of $F' = 1/F$ is the same as the distribution of F with n - 1 and n - 1 D.F., except that the curve has to be traced from right to the left.

The result (4) implies that 

Exercises 

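Ex. 1. If $s_1^2$, $s_2^2$ are the sample variances based on n independent observations each from two normal populations with variances $\sigma_1^2$, $\sigma_2^2$, calculate the 90% confidence interval for $\sigma_1^2/\sigma_2^2$ in terms of c, where $P[F > c]$ based on n - 1 and n - 1 D.F. is 0.05.

[M.Sc. Delhi '54]

[Hint. Since $\nu_1 = \nu_2$, F and 1/F have the same distribution, so that

\[ P[F > c] = 0.05 \quad\Rightarrow\quad P\left[F < \frac{1}{c}\right] = 0.05. \]

Hence, combining,

\[ P\left[\frac{1}{c} < F < c\right] = 0.90, \qquad\text{where}\quad F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}, \]

\[ \Rightarrow\ P\left[\frac{1}{c} < \frac{s_1^2}{s_2^2}\cdot\frac{\sigma_2^2}{\sigma_1^2} < c\right] = 0.90 \quad\Rightarrow\quad P\left[\frac{1}{c}\cdot\frac{s_1^2}{s_2^2} < \frac{\sigma_1^2}{\sigma_2^2} < c\,\frac{s_1^2}{s_2^2}\right] = 0.90. \]

Thus the 90% confidence limits for $\sigma_1^2/\sigma_2^2$ are $\dfrac{1}{c}\cdot\dfrac{s_1^2}{s_2^2}$ and $c\,\dfrac{s_1^2}{s_2^2}$.]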


Ex. 2. Show how the probability points of $F(\nu_2, \nu_1)$ can be obtained from those of $F(\nu_1, \nu_2)$.

[M.A. Agra '67, M.A. Delhi '60]

Ex. 3. Prove that if x has the F-distribution with (m, n) D.F. and y has the F-distribution with (n, m) D.F., then for a > 0,

\[ P(x \le a) + P\left(y \le \frac{1}{a}\right) = 1. \]

[B.Sc. Bombay '68]

Note 3. If we make the change of variable in (1)

\[ z = \tfrac12 \log_e F, \]

Fisher's z-distribution is obtained:

\[ dp = \frac{2\,\nu_1^{\frac12\nu_1}\,\nu_2^{\frac12\nu_2}\ e^{\nu_1 z}\ dz}{B\left(\tfrac12\nu_1, \tfrac12\nu_2\right)\left(\nu_1 e^{2z} + \nu_2\right)^{\frac12(\nu_1+\nu_2)}}, \qquad -\infty < z < \infty \qquad \ldots(5) \]

\[ = g_{\nu_1,\nu_2}(z)\ dz, \qquad -\infty < z < \infty. \]

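Ex. Find the distribution of the variance ratio F (with d.f. $n_1$, $n_2$). Show that the characteristic function $\phi(t)$ of $z = \tfrac12 \log_e F$ is given by

\[ \phi(t) = \left(\frac{n_2}{n_1}\right)^{it/2} \frac{\Gamma\left(\tfrac12(n_1 + it)\right)\ \Gamma\left(\tfrac12(n_2 - it)\right)}{\Gamma\left(\tfrac12 n_1\right)\ \Gamma\left(\tfrac12 n_2\right)}. \]

Hence, or otherwise, show that for large $n_1$ and $n_2$,

\[ E(z) = \frac12\left(\frac{1}{n_2} - \frac{1}{n_1}\right) \qquad\text{and}\qquad \mathrm{Var}(z) = \frac12\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\ \text{approx.} \]

[Hint. Put $F = e^{2z}$. Then

\[ \phi(t) = \int_{-\infty}^{\infty} e^{itz}\ g_{n_1,n_2}(z)\ dz = \int_0^\infty F^{it/2}\ h_{n_1,n_2}(F)\ dF = E\left(F^{it/2}\right) \]

\[ = \left(\frac{n_2}{n_1}\right)^{it/2} \frac{\Gamma\left(\tfrac12(n_1 + it)\right)\ \Gamma\left(\tfrac12(n_2 - it)\right)}{\Gamma\left(\tfrac12 n_1\right)\ \Gamma\left(\tfrac12 n_2\right)} \qquad [\text{See result (3)}] \]

Hence

\[ K(t) = \log \phi(t) = \frac{it}{2}\left[\log n_2 - \log n_1\right] + \log \Gamma\left(\tfrac12(n_1 + it)\right) + \log \Gamma\left(\tfrac12(n_2 - it)\right) - \log \Gamma\left(\tfrac12 n_1\right) - \log \Gamma\left(\tfrac12 n_2\right). \]

Use Stirling's approximation

\[ \log \Gamma(n+1) \simeq \left(n + \tfrac12\right)\log n - n + \log\sqrt{2\pi} \]

and expand in powers of t to obtain

\[ \kappa_1 = \frac12\left(\frac{1}{n_2} - \frac{1}{n_1}\right), \qquad \kappa_2 = \frac12\left(\frac{1}{n_1} + \frac{1}{n_2} + \frac{1}{n_1^2} + \frac{1}{n_2^2}\right) \simeq \frac12\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\ \text{approx.}\ ] \]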
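Note 4. As $\nu_2 \to \infty$, $\nu_1 F$ tends to be distributed as a $\chi^2$ distribution with $\nu_1$ D.F. This follows on using Stirling's approximation.

As $n \to \infty$,

\[ \Gamma(n+1) \simeq \sqrt{2\pi}\ e^{-n}\ n^{\,n+\frac12}, \]

so that

\[ \frac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\left(\frac{\nu_2}{2}\right)} \simeq \frac{\sqrt{2\pi}\ e^{-\frac{\nu_1+\nu_2-2}{2}} \left(\frac{\nu_1+\nu_2-2}{2}\right)^{\frac{\nu_1+\nu_2-1}{2}}}{\sqrt{2\pi}\ e^{-\frac{\nu_2-2}{2}} \left(\frac{\nu_2-2}{2}\right)^{\frac{\nu_2-1}{2}}} = e^{-\nu_1/2} \left(\frac{\nu_2-2}{2}\right)^{\nu_1/2} \left[1 + \frac{\nu_1}{\nu_2-2}\right]^{\frac{\nu_1+\nu_2-1}{2}} \to \left(\frac{\nu_2}{2}\right)^{\nu_1/2}, \]

and

\[ \left(1 + \frac{\nu_1}{\nu_2}F\right)^{-\frac{\nu_1+\nu_2}{2}} \to e^{-\nu_1 F/2} \qquad\text{as } \nu_2 \to \infty. \]

Thus

\[ dp \simeq \frac{1}{\Gamma(\nu_1/2)} \left(\frac{\nu_1}{2}\right)^{\nu_1/2} F^{\frac{\nu_1}{2}-1}\ e^{-\nu_1 F/2}\ dF \qquad\text{as } \nu_2 \to \infty. \]

Put $F = \dfrac{\chi^2}{\nu_1} \Rightarrow dF = \dfrac{2}{\nu_1}\ d\!\left(\tfrac12\chi^2\right)$:

\[ dp = \frac{1}{\Gamma(\nu_1/2)}\ e^{-\frac12\chi^2} \left(\tfrac12\chi^2\right)^{(\nu_1/2)-1} d\!\left(\tfrac12\chi^2\right) \]

$\Rightarrow$ $\chi^2$ distribution with $\nu_1$ D.F.

Here the $\chi^2$ distribution has been obtained as the limiting form of the F-distribution.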


Exercises 

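1. Let $\chi_1^2$, $\chi_2^2$ be independently distributed variates each having a chi-square distribution, with $n_1$ and $n_2$ D.F. Derive the distribution of

\[ F = \frac{\chi_1^2/n_1}{\chi_2^2/n_2}. \qquad [\text{I.S.I. Calcutta '53, '56}] \]

2. If x is distributed as F with 2 and n d.f., show that

\[ P(x \ge k) = \left(1 + \frac{2k}{n}\right)^{-n/2}. \]

3. (a) Show that the distribution of $F = \dfrac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$, where $S_1^2$, $S_2^2$ are the independent estimates of $\sigma_1^2$ and $\sigma_2^2$ respectively based on $n_1$, $n_2$ degrees of freedom, depends only on $n_1$ and $n_2$.

(b) Show also that the mean of the F distribution does not involve $n_1$. [M.A. Delhi '58]

(c) When $n_1 = 2$, show that the significance level of F corresponding to a significance probability p is

\[ F = \frac{n_2}{2}\left(p^{-2/n_2} - 1\right). \]

[M.A. Statistics Delhi '65]

[Hint. With $n_1 = 2$,

\[ dp = \frac{1}{B\left(1, \frac{n_2}{2}\right)} \left(\frac{2}{n_2}\right) \left(1 + \frac{2F}{n_2}\right)^{-\frac{2+n_2}{2}} dF = \left(1 + \frac{2F}{n_2}\right)^{-\frac{2+n_2}{2}} dF, \qquad 0 < F < \infty, \]

\[ \therefore\ p = \int_F^\infty \left(1 + \frac{2u}{n_2}\right)^{-\frac{2+n_2}{2}} du = \left(1 + \frac{2F}{n_2}\right)^{-n_2/2} \quad\Rightarrow\quad F = \frac{n_2}{2}\left[p^{-2/n_2} - 1\right].\ ] \]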


variates, state with 1 

(0 

w-f-v-Fw, 

and (iv) 

Uy/2 


u 


<"> TO* 


(iii) 


V- 4- H' 8 
u- 


5 If 

normal * 1 ,* * 3 ’ are independent random readings from a 

samnli P °J 5Uja ! 10n With zero mean and unit variance, obtain the 
^mpling distributions of the following statistics : 

(a) u= - x i 2 +*2 Z 

x i*+x 2 *-t-x 3 * * 

(b) v=_ --*»±**+*3_ _ 

M (*i-A: f )«-F J (Ar I -2jr 2 +jr 8 )*4-.v 4 !S ]>« 

(M. A. Calcutta ’42] 

[Hint, (b) ,) => j 

i (*i-* 3 ) 2 ~/., 2 

ence v -—.^.3 f), using additive property of x 2 

distribution] 

6. Derive the distribution of Snedecor’s F with m and n d f. 
distributions* COnnection betweea Snedecore’s F and Fisher’s Z 





, J f F ? " 1 * no,es Shedecor’ F with m and n d. f. and if F„, « 

denotes its 100«% point, show that 

(a) F m ,„ is distributed like ifn is large. 

. distributed t». if m=l, where t«„ is student's 

* witn n d. f. 

frl P 1 

' ' 1 m* m a= p - 

7 If * a m * "* (l ~ a) 

• s l and 5,* are two independent unbiased estimates of the 

r? U fi , nT, S h Varia i ,8 r fr ° m ,he normal da ‘ a based on n, and n, 

U. 1., find the p. d, f. of statistic 




2 


J 2 2 


Show that 


(i) E(F) - 'll 


n z —2 


and (ii) P[F^Fo]-( y* 

if /»i «2 and ** 

from a normal population, obtain the distribution of 


Show that 




(i) ? (F > a)=1 _ P ( F >_L\ 

8Dd <*> * * s student’s ,*of n d. f., t* is F (l* n). ° 

If x, and x a are independent variates distributed in the 

same way as dF= e~* dx *^>n ok • r 

’ *>0, show that x,+x 8 and ■£- are inde¬ 
pendently distributed and that 3- i, F with (2, 2 ) d f’ 

•ion '• - .be , distribu- 

( M Sc Bo®bay *8. B Se Hons. Delhi 67) 

'1 .Sr,; r.s z&vsrxz 

srs?.-—- - -' ‘™ 



(i-p) s* u 

tv= (l $2 has h n - 1 , r _! ( F ) distribution, 

where for /= 1 , 2 ,..., n 

and («-l) 5 2 „=27 (*/-//)*; (/i-l) 5^=27 (vy-T) 2 

flu =27 *//, nv =27 v, 

JHlnt. Note that u it v, are independent normal variables 
wit means and variances 2o a (1 -fp), 2p 2 (1 — p). 

Hence fo~~P s *» . B<l (»-l) 5 1 . . 

2a 2 (1 + p) Q 2&' (i-6 \ are independent X 2 ’s each with 

(n-l)d. f.] 

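13.10. Applications of F-distribution or Tests of significance based on F distribution.

(a) Two samples with values are

$x_i$ $(i = 1, 2, \ldots, n_1)$ from $N(\mu_1, \sigma_1^2)$,

$x'_j$ $(j = 1, 2, \ldots, n_2)$ from $N(\mu_2, \sigma_2^2)$.

Then if

\[ s_1^2 = \frac{1}{n_1 - 1}\sum (x_i - \bar{x}_1)^2, \quad\text{where } \bar{x}_1 = \text{mean in the first sample}, \]

\[ s_2^2 = \frac{1}{n_2 - 1}\sum (x'_j - \bar{x}_2)^2, \quad\text{where } \bar{x}_2 = \text{mean in the second sample}, \]

\[ F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} = \frac{s_1^2}{s_2^2}\cdot\frac{\sigma_2^2}{\sigma_1^2} \]

is distributed according to $h_{n_1-1,\ n_2-1}(F)$.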

(b) Suppose k samples are obtained from 

N (/*i» <**)» N (fxj, cr 2 ),..., TV (/z*, o 2 ) respectively. 

Then 27 . T P ■*'* = x a 

c> — * «-*» £ni = n, 

where n/ is the size of the ith sample. 

Since — 

H 2 Hi ^ 


rj*ln, 

■ I 

/= i <*7fl< 

-7T^r^ = ~- • - ”' 27 (jc/~ w )» 

* —k/n—k /.*„_* k 27 (/r ( — 1) 

i 







(c)
\[ \frac{r^2\,(n-2)}{1-r^2} = F_{1,\ n-2}, \]

where r is the correlation coefficient in the sample of size n from a bivariate normal population with $\rho = 0$.

(d) Testing for significance of the correlation ratio $\eta$ of y on x:

\[ F = \frac{\eta^2\,(N-h)}{(1-\eta^2)\,(h-1)} = F_{h-1,\ N-h}, \]

where h is the number of arrays of y in a sample of size N.

We reject the hypothesis if $F > F_{0.05}$.

$\eta$ is calculated from

\[ \sum_i \sum_j (y_{ij} - \bar{y}_i)^2 = (1-\eta^2) \sum_i \sum_j (y_{ij} - \bar{y})^2, \]

where $(x_i, y_{ij})$ $(i = 1, 2, \ldots, h;\ j = 1, 2, \ldots, n_i)$ denote a random sample of size N $(= \sum n_i)$ from a bivariate normal population.

(e) Testing for non-linearity of regression: For a sample of size N arranged in h arrays from a normal population the statistic

\[ F = \frac{\eta^2 - r^2}{1 - \eta^2}\cdot\frac{N-h}{h-2} = F_{h-2,\ N-h}. \]

(f) Testing the significance of an observed multiple correlation.

Let R denote the multiple correlation coefficient between a variate and p other variates in a random sample of size N from a (p + 1)-variate normal distribution with multiple correlation coefficient zero. Then

\[ F = \frac{R^2}{1-R^2}\cdot\frac{N-p-1}{p} = F_{p,\ N-p-1}. \]

Ex. 1. It is known that the mean diameters of rivets produced by two firms A and B are practically the same, but the standard deviations may differ. For 22 rivets produced by A, the S.D. is 2.9 mm, while for 16 rivets manufactured by firm B, the S.D. is 3.8 mm. Compute the statistic you would use to test whether the products of firm A have the same variability as those of firm B. Also state how you would proceed further. (B.Sc. Delhi '68)

Here $n_1 = 22$, $S_1 = 2.9$ mm,

$n_2 = 16$, $S_2 = 3.8$ mm.

The two estimates of the population variance furnished by the samples are

\[ \frac{22 \times (2.9)^2}{21} = 8.810 \qquad\text{and}\qquad \frac{16 \times (3.8)^2}{15} = 15.403 \]

respectively. The second is larger, so that

\[ F = \frac{15.403}{8.810} = 1.75, \qquad \nu_1 = 15,\ \ \nu_2 = 21. \]

If $F < F_{0.05}$, the value of F is not at all significant and we conclude that both the firms have the same variability in their products.
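The arithmetic of Ex. 1 can be reproduced in a few lines. This is a sketch only: the helper name `unbiased_variance` is hypothetical, and it assumes that the quoted S.D.'s were computed with divisor n, so that the unbiased estimate is $nS^2/(n-1)$.

```python
def unbiased_variance(sd, n):
    """Unbiased estimate n*S^2/(n-1), assuming the S.D. used divisor n."""
    return n * sd ** 2 / (n - 1)


est_a = unbiased_variance(2.9, 22)   # firm A
est_b = unbiased_variance(3.8, 16)   # firm B

# the larger estimate goes in the numerator, and its d.f. becomes nu_1
if est_b > est_a:
    F, v1, v2 = est_b / est_a, 16 - 1, 22 - 1
else:
    F, v1, v2 = est_a / est_b, 22 - 1, 16 - 1

print(round(est_a, 3), round(est_b, 3), round(F, 2), v1, v2)
```

The computed F is then compared with the tabulated 5% point for $\nu_1 = 15$ and $\nu_2 = 21$, or with the nearest tabulated degrees of freedom.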

Ex. 2. In one sample of 8 observations the sum of the squares of deviations of the sample values from the sample mean was 84.4 and in the other sample of 10 observations it was 102.6. Test whether this difference is significant at 5% level, given that the 5% point of F for $\nu_1 = 7$ and $\nu_2 = 9$ d.f. is 3.29. (I.A.S. '65)

Here
\[ s_1^2 = \frac{84.4}{7} = 12.057, \qquad s_2^2 = \frac{102.6}{9} = 11.4, \]

so that

\[ F = \frac{s_1^2}{s_2^2} = \frac{12.057}{11.4} = 1.057. \]

For $\nu_1 = 7$, $\nu_2 = 9$, $F_{0.05} = 3.29$.

The difference is very far from being significant, and the samples may well be drawn from the same population.

Exercises 


1. Explain why the larger variance is placed in the numerator of the statistic F. Discuss the applications of the F test in testing if two variances are homogeneous. [B.Sc. Meerut '68]

2. (i) Give the applications of t and F distributions in tests of significance. [B.A. Hon's Delhi '67]

(ii) Write a short (critical) note on F-distribution. [B.A. Hon's Delhi '66, '65]

3. Two samples of sizes 10 and 12 drawn from two normal populations yielded the following results:

\[ \sum_{i=1}^{10} (x_i - \bar{x})^2 = 120, \qquad \sum_{j=1}^{12} (y_j - \bar{y})^2 = 314. \]

Test whether the two populations have the same variance.

4. The correlation coefficients between heights and weights of 23 girls and 28 boys of a certain college were found to be 0.5 and 0.8 respectively. Can it be regarded that the heights and weights of boys and girls of this college are equally correlated?

5. A random sample of 150 pairs from a bivariate normal 
population when grouped in 15 array’s of/s gave values r=0 4 




tests 9 Wha ‘ 3re " eant by ,ar « e sa “Pl® ««t S and small sample 

respectively provide unbiased estimates V and V of “. and t? 
Obtain the p. d. f. of the ratio 

$ 2 ** 

How would you test the hypothesis ffi *=a 2 *. 

7. For random samples of sizes 10 and 15, the unbiased estimates of variances are found to be 5 and 9 respectively. Can we reasonably conclude that the population variances are equal?

8. If u and v are independent $\chi^2$ variates with m and n d.f. respectively, show that $x = u + v$ and $y = u/v$ are independently distributed. What is the distribution of y? Show how this distribution can be used to compare two variances under suitable conditions.

9. If x is a standard normal variable, find the p.d.f. of $x^2$.

Find the m.g.f. of $\sum_{i=1}^{n} x_i^2$, where $x_i$ $(i = 1, 2, \ldots, n)$ are independent standard normal variables. What is the distribution of $\sum_{i=1}^{n} x_i^2$ called?

How will you use this to test the hypothesis $\sigma^2 = \sigma_0^2$, given a random sample of size n from a normal population with zero mean and variance $\sigma^2$?

. a '"a H ,?l W ° U ' d y ° U Use SIudcnt ’ 3 '-'«t and Fisher’s z-test 
to decide whether the two sets of observations 

* ;17 > 27, 18, 25, 27, 29, 27, 2.1,17 

y: 16, 16,20, 16, 20, 17, 15, 21 

indicate samples diawn from the same universe ? 

11. Two random samples drawn from two normal^oplfl”;^ 

are . 

Sample I : 20 16 26 27 23 12 18 24 25 19 
Sample II: 27 33 42 35 32 34 38 28 41 43 30 37 



^ and 

[I- A. S. ’56J 


[Hint. 


n A . . * A 


Po r , = n>v r n, Vl= , 

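The sampling distribution of the correlation coefficient r in random samples of n pairs of values from a bivariate normal population with correlation $\rho$ is given by Fisher in the form

\[ dp = \frac{(1-\rho^2)^{(n-1)/2}}{\pi\,(n-3)!}\ (1-r^2)^{(n-4)/2}\ \frac{d^{\,n-2}}{d(r\rho)^{n-2}} \left\{\frac{\cos^{-1}(-r\rho)}{\sqrt{1 - r^2\rho^2}}\right\} dr. \]

This distribution is far from normal. The probability curve is very skew in the neighbourhood of $\rho = \pm 1$, even for large samples. Thus there is need for a transformation of r. Fisher, however, has shown that the transformation

\[ z = \tfrac12 \log_e \frac{1+r}{1-r}, \qquad \zeta = \tfrac12 \log_e \frac{1+\rho}{1-\rho} \]

gives a variate z whose distribution approximates the normal with mean $\zeta$ and variance $\dfrac{1}{n-3}$, and which tends rapidly to normality as n increases. Thus the S.E. of z does not involve r.

The transformation when $\rho = 0$.

The distribution of r in random samples from an uncorrelated normal population is

\[ dp = \frac{(1-r^2)^{(n-4)/2}}{B\left(\tfrac12, \tfrac{n-2}{2}\right)}\ dr, \qquad -1 \le r \le 1. \]

Fisher's transformation is

\[ z = \tfrac12 \log \frac{1+r}{1-r} \quad\Rightarrow\quad e^{2z} = \frac{1+r}{1-r} \quad\Rightarrow\quad r = \frac{e^{2z} - 1}{e^{2z} + 1} = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} = \tanh z. \]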

e normal distribution has the asymptotic variance ( l -P»)» 

••• The necessory transformation is ( "~ U 

F(6\= ( C V(n— D . 

) l_ p * P“ tonA- 1 P choosing c suitably 

wm * 1 


F (/a/i-i r ) 


1 

/i-i 





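Hence $dr = \operatorname{sech}^2 z\ dz$.

Thus
\[ dp = \frac{1}{B\left(\tfrac12, \tfrac{n-2}{2}\right)}\ \operatorname{sech}^{\,n-2} z\ dz. \]

Since $\operatorname{sech} z \simeq e^{-\frac12 z^2}$,

\[ dp \propto e^{-\frac12 (n-2) z^2}\ dz, \]

so the distribution of z is approximately normal with variance $\dfrac{1}{n-2}$. A better approximation is, however,

\[ \mathrm{Var}(z) = \frac{1}{n-3}. \]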

Table 5. Fisher's transformation of r

Values of r (= tanh z) for specified values of z at intervals of 0.02.

  z      0.00    0.02    0.04    0.06    0.08

 0.0    0.000   0.020   0.040   0.060   0.080
 0.1    0.100   0.119   0.139   0.159   0.178
 0.2    0.197   0.217   0.236   0.254   0.273
 0.3    0.291   0.310   0.328   0.345   0.363
 0.4    0.380   0.397   0.414   0.430   0.446
 0.5    0.462   0.478   0.493   0.508   0.523
 0.6    0.537   0.551   0.565   0.578   0.592
 0.7    0.604   0.617   0.629   0.641   0.653
 0.8    0.664   0.675   0.686   0.696   0.706
 0.9    0.716   0.726   0.735   0.744   0.753
 1.0    0.762   0.770   0.778   0.786   0.793
 1.1    0.801   0.808   0.814   0.821   0.828
 1.2    0.834   0.840   0.846   0.851   0.857
 1.3    0.862   0.867   0.872   0.876   0.881
 1.4    0.885   0.890   0.894   0.898   0.902
 1.5    0.905   0.909   0.912   0.915   0.919
 1.6    0.922   0.925   0.928   0.930   0.933
 1.7    0.935   0.938   0.940   0.943   0.945
 1.8    0.947   0.949   0.951   0.953   0.955
 1.9    0.956   0.958   0.960   0.961   0.963

Reproduced by permission of the author, Professor R. A. Fisher, from his book on Statistical Methods for Research Workers.

Tests of significance based on Fisher’s z transformation 

By means of the statistic z we may test whether an observed correlation coefficient r differs significantly from some given value of $\rho$ of the population, or whether the two values of r, say $r_1$ and $r_2$, found from two independent samples differ significantly. From the given values of r and $\rho$ the values of z and $\zeta$ can be determined, and it is easy to decide whether the deviation $z - \zeta$ is significant for the normal distribution of variance $\dfrac{1}{n-3}$.

To avoid the necessity of calculating z in every case, Fisher has published a table setting out the values of r which correspond to specified values of z ranging from 0 to 3 at intervals of 0.01.

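Ex. 1. A correlation coefficient of 0.72 is obtained from a sample of 29 pairs of observations. (a) Can the sample be regarded as drawn from a bivariate normal population in which the true correlation is 0.8? (b) Obtain the 95% fiducial limits for $\rho$.

(a) Here n = 29, r = 0.72, $\rho$ = 0.8.

\[ z = \tfrac12 \log_e \frac{1+r}{1-r} = 1.1513 \log_{10} \frac{1+r}{1-r} = 0.907, \]

\[ \zeta = \tfrac12 \log_e \frac{1+\rho}{1-\rho} = 1.1513 \log_{10} \frac{1+\rho}{1-\rho} = 1.100, \]

\[ |z - \zeta| = 0.193. \]

\[ \mathrm{Var}(z) = \frac{1}{26} \quad\Rightarrow\quad \text{S.E. of } z = \frac{1}{\sqrt{26}} = 0.196, \]

\[ \therefore\ \frac{|z - \zeta|}{\text{S.E.}} = \frac{0.193}{0.196} = 0.985, \text{ which is less than 2.} \]

So $|z - \zeta|$ is not significant at the 5% level. So far as this evidence goes, the correlation in the population might very well be 0.8.

(b) 95% fiducial limits for $\zeta$ are given by

\[ |z - \zeta| < 1.96\ (\text{S.E.}) = 1.96 \times 0.196 = 0.384 \]

\[ \Rightarrow\ 0.907 - 0.384 < \zeta < 0.907 + 0.384 \]

\[ \Rightarrow\ 0.523 < \zeta < 1.291. \]

Hence from the relation $\zeta = \tfrac12 \log_e \dfrac{1+\rho}{1-\rho}$, after consulting the table, we can find $\rho$.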
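The steps of this example can be sketched in code. The function name `fisher_z_test` and the returned dictionary are illustrative assumptions; `math.atanh` and `math.tanh` are used for the transformation and its inverse, in place of the logarithm tables and Table 5.

```python
import math


def fisher_z_test(r, n, rho0, z_crit=1.96):
    """Compare an observed r with a hypothetical rho0 via Fisher's z."""
    z = math.atanh(r)            # z = (1/2) log((1 + r)/(1 - r))
    zeta0 = math.atanh(rho0)
    se = 1.0 / math.sqrt(n - 3)
    lo, hi = z - z_crit * se, z + z_crit * se
    return {
        "z": z,
        "zeta0": zeta0,
        "se": se,
        "ratio": abs(z - zeta0) / se,
        "zeta_limits": (lo, hi),
        "rho_limits": (math.tanh(lo), math.tanh(hi)),
    }


result = fisher_z_test(0.72, 29, 0.8)
print({key: value for key, value in result.items() if key != "zeta0"})
```

Transforming the limits for $\zeta$ back through $\rho = \tanh \zeta$ gives limits for $\rho$ of roughly 0.48 and 0.86, which is what consulting the table would yield.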

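Ex. 2. Is the observed correlation coefficient consistent with the assumption that $\rho = 0.5$ in the population? Also obtain 95% fiducial limits for $\rho$.

[Hint. $z = 0.87$, $\zeta = 0.55$, $z - \zeta = 0.32$.

\[ \mathrm{Var}(z) = \frac{1}{25} \quad\Rightarrow\quad \text{S.E. of } z = \frac15 = 0.2, \]

\[ \frac{z - \zeta}{\text{S.E.}} = \frac{0.32}{0.2} = 1.6 < 2 \quad\Rightarrow\quad z - \zeta \text{ is not significant.} \]

\[ |z - \zeta| < 1.96\ (\text{S.E.}) = 0.392 \]

\[ \Rightarrow\ 0.48 < \zeta < 1.26 \quad\Rightarrow\quad 0.446 < \rho < 0.851.\ ] \]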

To test the significance of the difference between the correlation coefficients of two independent samples.

Suppose in two independent samples of $n_1$ and $n_2$ pairs of values the correlation coefficients are found to be $r_1$ and $r_2$ respectively. May the samples be regarded as drawn from the same population, or, is the difference $|r_1 - r_2|$ significant?

Assuming that the samples are from the same normal population, the difference $z_1 - z_2$ is normally distributed with

\[ \mathrm{Var}(z_1 - z_2) = \mathrm{Var}(z_1) + \mathrm{Var}(z_2) = \frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}. \]

Let
\[ \epsilon = \sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}. \]

$z_1$ and $z_2$ are given by

\[ z_1 = \tfrac12 \log_e \frac{1+r_1}{1-r_1}, \qquad z_2 = \tfrac12 \log_e \frac{1+r_2}{1-r_2}. \]

If $|z_1 - z_2| < 2\epsilon$, the difference is not significant at the 5% level and the assumption that the samples are drawn from the same population, or from equally correlated normal populations, is not discredited.

Ex. Given $n_1 = 23$, $r_1 = 0.5$,

$n_2 = 28$, $r_2 = 0.8$.

Are $r_1$ and $r_2$ significantly different?

We assume that the two samples are from the same normal population; the S.E. of $z_1 - z_2$ is

\[ \epsilon = \sqrt{\frac{1}{20} + \frac{1}{25}} = \sqrt{0.09} = 0.3. \]

The table gives $z_1 = 0.55$, $z_2 = 1.10$.

Hence
\[ \frac{|z_1 - z_2|}{\epsilon} = \frac{0.55}{0.3} = 1.83, \text{ which is less than 2.} \]

$\therefore\ |z_1 - z_2| < 2\epsilon$ and thus the difference is not significant at the 5% level. Therefore the hypothesis is not discredited.
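The comparison of two correlation coefficients reduces to a short calculation. The sketch below uses the figures of the example just given; the function name `compare_correlations` is an illustrative assumption.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """Return z1, z2, the S.E. of z1 - z2, and |z1 - z2| / S.E."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return z1, z2, se, abs(z1 - z2) / se


z1, z2, se, ratio = compare_correlations(0.5, 23, 0.8, 28)
print(round(z1, 2), round(z2, 2), round(se, 2), round(ratio, 2))
```

A ratio below 2 corresponds to a difference that is not significant at the 5% level, in agreement with the conclusion reached above.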

Combination of estimates of a correlation coefficient

Suppose there are k samples having $n_1, n_2, \ldots, n_k$ pairs of values. From them the correlation coefficients are found to be $r_1, r_2, \ldots, r_i, \ldots, r_k$. In other words, the ith sample with $n_i$ pairs of values has the correlation coefficient $r_i$ $(i = 1, 2, \ldots, k)$. We may wish to investigate whether the k samples may be regarded as coming from the same normal population (or from equally correlated ones). On this assumption we may like to obtain a combined estimate of the population correlation $\rho$.

To test the homogeneity of the estimates $r_i$, we assume that the samples are drawn from equally correlated populations. We obtain values $z_i$ of variates, by Fisher's transformation, which we know are approximately normally distributed about a common mean, with variances $\dfrac{1}{n_i - 3}$. The estimate of their common $\zeta$ is provided by $\bar{z}$, where

\[ \bar{z} = \frac{\sum_i (n_i - 3)\,z_i}{\sum_i (n_i - 3)}. \]

Then

\[ \sum_i \left[\frac{z_i - \bar{z}}{1/\sqrt{n_i - 3}}\right]^2 = \sum_i (n_i - 3)(z_i - \bar{z})^2 \ \text{ is a } \chi^2_{k-1} \text{ variate.} \]

The significance of the calculated value of this quantity may be ascertained from the table of $\chi^2$. We may express the sum

\[ \sum_i (n_i - 3)(z_i - \bar{z})^2 = \sum_i (n_i - 3)\,z_i^2 - \bar{z}^2 \sum_i (n_i - 3) = \sum_i (n_i - 3)\,z_i^2 - \frac{\left[\sum_i (n_i - 3)\,z_i\right]^2}{\sum_i (n_i - 3)}. \]

If the calculated value of $\sum_i (n_i - 3)(z_i - \bar{z})^2$ is not significant as a value of $\chi^2_{k-1}$, the estimates $r_i$ of the correlation in the population may be regarded as homogeneous. In this case $\bar{z}$ is an estimate of the true value $\zeta$ corresponding to the population coefficient $\rho$. The required estimate of $\rho$ is given by

\[ \bar{z} = \tfrac12 \log_e \frac{1+\rho}{1-\rho} \quad\Rightarrow\quad \rho = \tanh \bar{z}. \]
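The homogeneity test and the combined estimate can be sketched as follows. The function name `combine_correlations` is an illustrative assumption, and the data are the group sizes 30, 20, 25 and sample values 0.63, 0.48, 0.71 quoted in Ex. 4 of this section; both forms of the weighted sum of squares are computed so that they can be checked against each other.

```python
import math


def combine_correlations(rs, ns):
    """Weighted mean of Fisher's z with the chi-square homogeneity statistic."""
    zs = [math.atanh(r) for r in rs]
    ws = [n - 3 for n in ns]                       # weights n_i - 3
    sum_w = sum(ws)
    sum_wz = sum(w * z for w, z in zip(ws, zs))
    z_bar = sum_wz / sum_w
    chi2 = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    # the alternative form of the same sum
    chi2_alt = sum(w * z * z for w, z in zip(ws, zs)) - sum_wz ** 2 / sum_w
    return z_bar, chi2, chi2_alt, len(rs) - 1, math.tanh(z_bar)


z_bar, chi2, chi2_alt, df, rho_hat = combine_correlations(
    [0.63, 0.48, 0.71], [30, 20, 25])
print(round(z_bar, 4), round(chi2, 3), df, round(rho_hat, 3))
```

The statistic is referred to the $\chi^2$ table with k - 1 = 2 D.F.; only if it is not significant is the combined estimate $\tanh \bar{z}$ meaningful.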

Ex. |. Given «,=2I, n t =30, n s =39, n,=26, „ l=35 

r,=0-39. r, = 0’61, r.-O-tt. r 4 =0>4, r 5 =0-48 

May these estimates of $\rho$ be regarded as homogeneous? If so, find an estimate of the correlation in the population.

Ex 2. The correlation coefficient between daily ration of 
green grass and rate of growing calves on the basis of observation 


regarded homogeneous ? If so estimate the common correlation 

coefficiem - [p=-1894] 

Ex. 3. The correlation coefficients between wing length and tongue length were estimated from 2 samples each of size 44 to be 0.731 and 0.690. Test whether the correlation coefficients are significantly different or not. If not, obtain the best estimate of the common correlation coefficient. [B.Sc. Delhi '66]

Ex. 4. Test for equality of the correlation coefficients between the scores in two halves of a psychological test applied to different groups of sizes 30, 20 and 25 if the corresponding sample values are 0.63, 0.48, 0.71 respectively. [B.Sc. Bom. '68]


14 

TEST OF SIGNIFICANCE BASED ON 

CHI-SQUARE DISTRIBUTION 


14.1 X 2 as an approximation to the multinomial probability 
Distribution. 


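Let $E_1, E_2, \ldots, E_k$ be $k$ mutually exclusive and exhaustive events for a random variable $X$, so that

$$\sum_{i=1}^{k} P(E_i)=1.$$

We write $P(E_i)=p_i$ and therefore $\sum_{i=1}^{k} p_i=1$.

The probability that in $n$ independent determinations of $X$ the event $E_1$ will occur exactly $m_1$ times, $E_2$ exactly $m_2$ times, ..., $E_k$ exactly $m_k$ times, with $\sum_{i=1}^{k} m_i=n$, is equal to

$$P=\frac{n!}{m_1!\,m_2!\cdots m_k!}\;p_1^{m_1}p_2^{m_2}\cdots p_k^{m_k}.$$

Assume that $n$ and all $m_i$ ($i=1, 2, \ldots, k$) are so large that all factorials may be replaced by the corresponding values of Stirling's approximation. We then obtain the approximation

$$P\simeq\frac{(2\pi)^{1/2}\,n^{n+\frac12}\,e^{-n}}{(2\pi)^{k/2}\,m_1^{m_1+\frac12}\cdots m_k^{m_k+\frac12}\,e^{-(m_1+m_2+\cdots+m_k)}}\;p_1^{m_1}p_2^{m_2}\cdots p_k^{m_k}
=\frac{1}{(2\pi n)^{(k-1)/2}(p_1p_2\cdots p_k)^{1/2}}\prod_{i=1}^{k}\left(\frac{np_i}{m_i}\right)^{m_i+\frac12}.$$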

If we consider the simple alternatives, $E_i$ with probability $p_i$ and not-$E_i$ with probability $1-p_i$, and consider our $n$ trials as an $n$-fold repetition of this alternative, we obtain

$$E(m_i)=np_i,\qquad \operatorname{Var}(m_i)=np_i(1-p_i),\qquad i=1, 2, \ldots, k.$$

Clearly $m_i$ is the observed number of outcomes $E_i$, and $np_i$ is the expected number of such outcomes.



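Let

$$t_i=\frac{m_i-np_i}{\sqrt{np_i(1-p_i)}}=\frac{m_i-np_i}{\sigma_i};$$

then $t_i$ is the normalised random variable corresponding to $m_i$. Obviously $\sigma_i$ is the S.D. of $m_i$.

Now $m_i=np_i+\sigma_i t_i$, so that

$$P\simeq\frac{1}{(2\pi n)^{(k-1)/2}(p_1p_2\cdots p_k)^{1/2}}\prod_{i=1}^{k}\left(1+\frac{t_i\sigma_i}{np_i}\right)^{-(t_i\sigma_i+np_i+\frac12)}
=\frac{U}{(2\pi n)^{(k-1)/2}(p_1p_2\cdots p_k)^{1/2}},\ \text{say.}$$

Now

$$\log U=-\sum_{i=1}^{k}\left(t_i\sigma_i+np_i+\tfrac12\right)\log\left(1+\frac{t_i\sigma_i}{np_i}\right)
=-\sum_{i=1}^{k}\left(t_i\sigma_i+np_i+\tfrac12\right)\sum_{r=1}^{\infty}\frac{(-1)^{r-1}}{r}\left(\frac{t_i\sigma_i}{np_i}\right)^{r},\quad\text{if }\left|\frac{t_i\sigma_i}{np_i}\right|<1.$$

Replacing $\sigma_i$ by its value $\sqrt{np_i(1-p_i)}$ and ordering the terms by decreasing powers of $n$, we get

$$\log U=-\sum_{i=1}^{k}t_i\sqrt{p_i(1-p_i)}\;n^{1/2}-\tfrac12\sum_{i=1}^{k}q_i t_i^{2}+R(n),\qquad q_i=1-p_i,$$

where $R(n)$ contains only terms with $n^{-1/2}, n^{-1}, \ldots$

Since

$$\sum_{i=1}^{k}t_i\sqrt{p_i(1-p_i)}\;n^{1/2}=\sum_{i=1}^{k}(m_i-np_i)=0
\quad\text{and}\quad
\sum_{i=1}^{k}q_it_i^{2}=\sum_{i=1}^{k}\frac{(m_i-np_i)^2}{np_i},$$

we rewrite $P$, for large $n$, in the form

$$P\simeq C_n\exp\left[-\tfrac12\sum_{i=1}^{k}\frac{(m_i-np_i)^2}{np_i}\right]=C_n\,e^{-\chi^2/2},$$

where

$$\chi^2=\sum_{i=1}^{k}\frac{(m_i-np_i)^2}{np_i}\quad\text{and}\quad C_n=(2\pi n)^{-(k-1)/2}(p_1p_2\cdots p_k)^{-1/2}.$$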

The form of $\chi^2$ in common use is

$$\chi^2=\sum_{i=1}^{k}\frac{(o_i-e_i)^2}{e_i},$$

where $o_i$ = observed frequency = $m_i$,

$e_i$ = expected frequency = $np_i$.

Since $\sum_i (o_i-e_i)=\sum_i (m_i-np_i)=0$, $\chi^2$ has $(k-1)$ D.F.

The alternative form for $\chi^2$ is

$$\chi^2=\sum_{i=1}^{k}\left(\frac{o_i^2}{e_i}-2o_i+e_i\right)=\sum_{i=1}^{k}\frac{o_i^2}{e_i}-n,\quad\text{since }\sum_i o_i=\sum_i e_i=n.$$

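For $n\to\infty$ the distribution of $\chi^2$ tends to

$$dP=\frac{1}{2^{(k-1)/2}\,\Gamma\!\left(\frac{k-1}{2}\right)}\;e^{-\chi^2/2}\,(\chi^2)^{\frac{k-1}{2}-1}\,d(\chi^2),\qquad 0<\chi^2<\infty,$$

with D.F. = $k-1$, which we denote by $\nu$. Thus

$$\lim_{n\to\infty}P(\chi^2<\chi_0^2)=\frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\int_0^{\chi_0^2}e^{-\chi^2/2}\,(\chi^2)^{\frac{\nu}{2}-1}\,d\chi^2.$$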

In practice this limiting form may be used for $P(\chi^2<\chi_0^2)$ as soon as $np_i\ge 10$ for all $i$.

$\chi^2$ thus gives a measure of the correspondence between fact and theory. This property of the $\chi^2$ distribution permits it to be used on a wide variety of problems involving a comparison of observed and theoretical frequencies.

Table 6. Values of $\chi^2$ with probability P of being exceeded in random sampling. $\nu$ = number of degrees of freedom.

 ν    P=0.99    0.95    0.50    0.30    0.20    0.10    0.05    0.01
 1    0.0002   0.004    0.46    1.07    1.64    2.71    3.84    6.64
 2    0.020    0.103    1.39    2.41    3.22    4.60    5.99    9.21
 3    0.115    0.35     2.37    3.66    4.64    6.25    7.82   11.34
 4    0.30     0.71     3.36    4.88    5.99    7.78    9.49   13.28
 5    0.55     1.14     4.35    6.06    7.29    9.24   11.07   15.09
 6    0.87     1.64     5.35    7.23    8.56   10.64   12.59   16.81
 7    1.24     2.17     6.35    8.38    9.80   12.02   14.07   18.48
 8    1.65     2.73     7.34    9.52   11.03   13.36   15.51   20.09
 9    2.09     3.32     8.34   10.66   12.24   14.68   16.92   21.67
10    2.56     3.94     9.34   11.78   13.44   15.99   18.31   23.21
11    3.05     4.58    10.34   12.90   14.63   17.28   19.68   24.72
12    3.57     5.23    11.34   14.01   15.81   18.55   21.03   26.22
13    4.11     5.89    12.34   15.12   16.98   19.81   22.36   27.69
14    4.66     6.57    13.34   16.22   18.15   21.06   23.68   29.14
15    5.23     7.26    14.34   17.32   19.31   22.31   25.00   30.58
16    5.81     7.96    15.34   18.42   20.46   23.54   26.30   32.00
17    6.41     8.67    16.34   19.51   21.62   24.77   27.59   33.41
18    7.02     9.39    17.34   20.60   22.76   25.99   28.87   34.80
19    7.63    10.12    18.34   21.69   23.90   27.20   30.14   36.19
20    8.26    10.85    19.34   22.78   25.04   28.41   31.41   37.57
21    8.90    11.59    20.34   23.86   26.17   29.62   32.67   38.93
22    9.54    12.34    21.34   24.94   27.30   30.81   33.92   40.29
23   10.20    13.09    22.34   26.02   28.43   32.01   35.17   41.64
24   10.86    13.85    23.34   27.10   29.55   33.20   36.42   42.98
25   11.52    14.61    24.34   28.17   30.68   34.38   37.65   44.31
26   12.20    15.38    25.34   29.25   31.80   35.56   38.89   45.64
27   12.88    16.15    26.34   30.32   32.91   36.74   40.11   46.96
28   13.56    16.93    27.34   31.39   34.03   37.92   41.34   48.28
29   14.26    17.71    28.34   32.46   35.14   39.09   42.56   49.59
30   14.95    18.49    29.34   33.53   36.25   40.26   43.77   50.89


Note. When $\nu>30$, $\sqrt{2\chi^2}-\sqrt{2\nu-1}$ may be used as a standard normal variate.

14.2 $\chi^2$-Test as a Test of Goodness of Fit.

The Pearsonian $\chi^2$ test is applied primarily to test the agreement between hypothesis and observation. Given any set of observations, we form a hypothesis about the inner working of the processes which might have given rise to these observations. From the supposed hypothesis we may be able to work out the frequencies corresponding to the categories or variate values which were the subject matter of our observations.

Let $o$ denote the observed frequency for any category or measurement and let $e$ denote the corresponding frequency that may be deduced from our hypothesis. If theory and fact are in complete agreement, $o$ should equal $e$, and $o-e=0$ in every case.

As a matter of fact $o-e$ neither is nor need be exactly zero, for the sampling fluctuations would certainly give rise to differences between $o$ and $e$. The question is only whether this difference is not more than what can possibly be explained by sampling errors.

Pearson takes

$$\chi^2=\sum_{i=1}^{k}\frac{(o_i-e_i)^2}{e_i}$$

as the measure of discrepancy between hypothesis and observation. Thus $\chi^2$ is the sum of "observed minus expected squared over expected." The frequencies $o_1, \ldots, o_k$ have a multinomial distribution which under the hypothesis has parameters $p_i=e_i/\sum e_i$ and $n=\sum e_i$. The exact distribution of $\chi^2$ is discrete. Fortunately there is an approximation for the hypothesis distribution of $\chi^2$, and the approximation is quite good for sufficiently large samples. The approximating distribution is the $\chi^2$ distribution with $k-1$ D.F. A zero value of $\chi^2$ would correspond to exact agreement with expectation, whereas increasingly large values of $\chi^2$ may be thought of as corresponding to one's notion of poor experimental results. It is customary to select a value $\chi_0^2$ such that

$$P(\chi^2>\chi_0^2)=\frac{1}{2^{(k-1)/2}\,\Gamma\!\left(\frac{k-1}{2}\right)}\int_{\chi_0^2}^{\infty}e^{-\chi^2/2}\,(\chi^2)^{(k-3)/2}\,d(\chi^2)=0.05$$

as a critical value for judging significance at the 5% level. Such a value is exceeded by chance in 5% of random samples only. If $P(\chi^2>\chi_0^2)$ for the observed value is greater than 5% we expect agreement between fact and theory; if however it is less than 5% we say that such a value is not likely to have arisen from sampling fluctuations alone.

Note 1. Experience and theoretical investigations indicate that the approximation to the $\chi^2$ distribution is usually satisfactory provided that $e_i\ge 5$ and $k\ge 5$. If some of the cell frequencies $e_i$ do not exceed 5, they may possibly be combined with other adjacent cell frequencies until the condition is satisfied. In any such reduction, of course, it is necessary to calculate the value of $\chi^2$ and to determine the number of D.F. after the reduction.


Note 2. (On degrees of freedom). For samples of fixed size, the number of D.F. is clearly less than the number $k$ of classes. For the sum of the class frequencies is constant ($\sum_{i=1}^{k} o_i=n$) and this corresponds to a linear constraint on the variates $o_i$. Further, to determine the theoretical class frequencies $e_i$ it is sometimes necessary to estimate parameters of the population from the data of the sample. For instance, in testing the hypothesis of a normal population, it may be necessary to estimate the mean and the variance of the population from the sample values. Each estimate of a parameter obtained in this manner corresponds to the introduction of a linear constraint. In calculating the number of D.F. of $\chi^2$ each constraint introduced in this manner must be recognised. It is shown that for each independent linear restriction imposed upon the observations $o_i$, the value of the parameter $\nu$ ($=k-1$) is decreased by unity; otherwise the function is unchanged.


Note 3. The $\chi^2$ test can even be applied when the hypothesis has unknown parameters. The $e_i$ are calculated as before but with the parameters of the hypothesis model replaced by their maximum likelihood estimates. The only change in the distribution for $\chi^2$ is that the D.F. is reduced by the number of parameters estimated. Suppose we test the hypothesis that the sample came from a normal distribution. Under this hypothesis there are two parameters, $\mu$ and $\sigma^2$, which are replaced by their maximum likelihood estimates. Calculating the 'probabilities' for the different intervals we obtain the statistic

$$\chi^2=\sum_{i=1}^{k}\frac{(o_i-e_i)^2}{e_i}.$$

The hypothesis distribution is approximately $\chi^2$ with $k-1-2=k-3$ D.F.

Conditions for the application of the $\chi^2$ test

(a) In the first place, $n$ must be reasonably large, otherwise the frequencies $m_i$ will not be approximately normally distributed; $n$ should be at least 50.

(b) Each expected frequency $e_i$ should be at least 5. When the $e_i$ are too small, they may be grouped together.

(c) The constraints must be linear, e.g. $\sum o_i=\sum e_i=n$.

Ex. 1. 200 digits were chosen at random from a set of tables. The frequencies of the digits were :

Digits       0   1   2   3   4   5   6   7   8   9
Frequency   18  19  23  21  16  25  22  20  21  15

Use the $\chi^2$ test to assess the correctness of the hypothesis that the digits were distributed in equal numbers in the tables from which these were chosen. [M. Sc. Agra '47]

On the hypothesis to be tested, each digit occurs the same number of times and is therefore expected to occur 200/10 = 20 times.

$$\chi^2=\tfrac{1}{20}\left[(18-20)^2+(19-20)^2+(23-20)^2+(21-20)^2+(16-20)^2+(25-20)^2+(22-20)^2+(20-20)^2+(21-20)^2+(15-20)^2\right]$$

$$=\tfrac{1}{20}\left[4+1+9+1+16+25+4+0+1+25\right]=\tfrac{86}{20}=4.3.$$

Degrees of freedom = number of independent observations = 10 - 1 = 9, i.e. $\nu=9$ here.

With 9 D.F., the probability of $\chi^2>4.3$ arising from chance in sampling is .89 or 89%, which is much greater than 5%. Therefore the hypothesis seems reasonable.
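The arithmetic of this example can be checked with a few lines of Python. This is only a sketch; the helper name `pearson_chi2` is an illustrative choice, and the critical value 16.92 is the tabulated 5% value for 9 d.f.

```python
def pearson_chi2(observed, expected):
    """Sum of (o - e)^2 / e over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 19, 23, 21, 16, 25, 22, 20, 21, 15]   # frequencies of digits 0..9
n = sum(observed)                                     # 200 digits in all
expected = [n / len(observed)] * len(observed)        # 20 for each digit
chi2 = pearson_chi2(observed, expected)
dof = len(observed) - 1                               # one linear constraint: totals agree
critical_5pct = 16.92                                 # tabulated 5% value for 9 d.f.
print(round(chi2, 2), dof, chi2 < critical_5pct)
```

Since the computed value falls well below the 5% point, the hypothesis of equal frequencies is not discredited.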

Ex. 2. Twelve dice were thrown 4096 times and a throw of 6 was reckoned as a success. The observed frequencies were as given below :

Number of successes    0     1     2    3    4    5   6  7 & over  Total
Frequencies          447  1145  1181  796  380  115  24      8      4096

Find $\chi^2$ on the hypothesis that the dice were unbiased and hence show that the data are consistent with this hypothesis so far as the $\chi^2$ test is concerned. [M. A. (Eco. Stat.) Delhi '56]

The probability of throwing a 6 with one die is $p=\frac16$, and of throwing any other number is $q=1-p=\frac56$. The successive frequencies are the various terms in the expansion of $N(q+p)^n$, i.e. $4096\left(\frac56+\frac16\right)^{12}$. They are

$$4096\left(\tfrac56\right)^{12};\quad 4096\,{}^{12}C_1\left(\tfrac56\right)^{11}\left(\tfrac16\right);\quad 4096\,{}^{12}C_2\left(\tfrac56\right)^{10}\left(\tfrac16\right)^{2};\ \text{etc.}$$

On simplification they are found to be

e : 459, 1102, 1212, 808, 364, 116, 27, 8; and the observed frequencies are

o : 447, 1145, 1181, 796, 380, 115, 24, 8.

$$\chi^2=\frac{(447-459)^2}{459}+\frac{(1145-1102)^2}{1102}+\cdots+\frac{(8-8)^2}{8}=5.811,\qquad\text{D.F. }\nu=8-1=7.$$

The 5% value of $\chi^2$ for $\nu=7$ is 14.07 from the table. Since the calculated value of $\chi^2$, 5.811, is much less than 14.07, the conclusion is that the dice were unbiased so far as the $\chi^2$ test is concerned.

Ex. 3. A list of wars of modern civilization provided the following data for the years 1500-1931. Fit a Poisson distribution to the data and test the goodness of fit.

No. of outbreaks in the year  x :    0    1   2   3  4  5  Total
No. of such years             f :  223  142  48  15  4  0   432

[B. A. Mad. '54]

The mean $m$ of the Poisson distribution is provided by

$$m=\frac{\sum fx}{\sum f}=\frac{299}{432}=0.692=0.7\ \text{approximately.}$$

The corresponding expected frequencies are calculated from

$$432\,e^{-m}\left[1,\ m,\ \frac{m^2}{2!},\ \frac{m^3}{3!},\ \frac{m^4}{4!},\ \frac{m^5}{5!}\right]\qquad(e^{-0.7}=0.497).$$

They are :

e : 214.5, 150.2, 52.5, 12.3, 2.1, 0.4 and
o : 223, 142, 48, 15, 4, 0.

We combine the last three frequencies in e and o to make the frequencies greater than 5, in order to satisfy the condition for application. Thus

x      0      1      2    3 & over
e    214.5  150.2   52.5    14.8
o    223    142     48      19

$$\chi^2=\frac{(223-214.5)^2}{214.5}+\cdots+\frac{(19-14.8)^2}{14.8}=2.36,\qquad\text{D.F. }\nu=4-1-1=2.$$

[The last d.f. subtracted is for calculating $m$ from the data.]

The 5% value of $\chi^2$ for 2 D.F. is 5.99. The value of $\chi^2$, i.e. 2.36, is less than the critical value. Hence the fit is good.
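The fitting procedure of this example can be sketched in Python. This is an illustration under stated assumptions rather than a reproduction of the hand computation: the expected frequencies are not rounded, and the pooled class "3 and over" is given the remainder of the total, so the result (about 2.4) differs slightly from the 2.36 obtained above with rounded figures.

```python
import math

years = [223, 142, 48, 15, 4, 0]                      # years with x = 0..5 outbreaks
n = sum(years)                                        # 432 years
mean = sum(x * f for x, f in enumerate(years)) / n    # 299 / 432
m = round(mean, 1)                                    # the text rounds the mean to 0.7

# Poisson expected frequencies for x = 0, 1, 2; the class "3 and over"
# receives whatever remains of the total (an assumption of this sketch).
expected = [n * math.exp(-m) * m ** x / math.factorial(x) for x in range(3)]
expected.append(n - sum(expected))
observed = years[:3] + [sum(years[3:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1 - 1                           # one more d.f. lost for estimating m
print(round(chi2, 2), dof)
```

Either way the statistic is far below the 5% value 5.99 for 2 d.f., so the conclusion is unchanged.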

14.3. To test the independence of two attributes generally :

Let any set of $N$ individuals be classified according to an attribute $X$ into groups $X_1, X_2, \ldots, X_r$ with frequencies $n_1, n_2, \ldots, n_r$, where $N=n_1+n_2+\cdots+n_r$. Let the same set of $N$ individuals be classified according to another attribute $Y$ into classes $Y_1, Y_2, \ldots, Y_s$ with grouping frequencies $m_1, m_2, \ldots, m_s$ where, of course,

$$N=m_1+m_2+\cdots+m_s.$$

How many have both the attributes ?

Suppose now that the frequency of individuals falling under both groups $X_i$ and $Y_j$ be $F_{ij}$. Then we form the contingency table like our old correlation table.


Y \ X     X_1    X_2    ...   X_i    ...   X_r    Total
Y_1       F_11   F_21   ...   F_i1   ...   F_r1   m_1
Y_2       F_12   F_22   ...   F_i2   ...   F_r2   m_2
...
Y_j       F_1j   F_2j   ...   F_ij   ...   F_rj   m_j
...
Y_s       F_1s   F_2s   ...   F_is   ...   F_rs   m_s
Total     n_1    n_2    ...   n_i    ...   n_r    N


The hypothesis is that of independence between the two attributes. To apply the $\chi^2$ method to test this hypothesis, we have to find the frequencies that we expect under it.

Let $E_{ij}$ be the expected frequency of individuals falling under both groups $X_i$ and $Y_j$.

Now $n_i$ out of $N$ individuals fall under $X_i$. Therefore the probability of an individual falling under $X_i$ is equal to $n_i/N$. Similarly, the probability of falling under $Y_j$ is equal to $m_j/N$. If the attributes are independent, these two events are independent and therefore the probability of both happening, that is of an individual falling under both $X_i$ and $Y_j$, is the product of the separate probabilities and therefore equal to $\dfrac{n_i m_j}{N^2}$.

The total number in $(X_i, Y_j)$ is $N\cdot\dfrac{n_i m_j}{N^2}=\dfrac{n_i m_j}{N}$, i.e.

$$E_{ij}=\frac{n_i m_j}{N}.$$

Hence

$$\chi^2=\sum_i\sum_j\frac{(F_{ij}-E_{ij})^2}{E_{ij}}=c,\ \text{say.}$$

If the probability of $\chi^2$ exceeding the value $c$ is greater than 0.05 (or if $c$ is less than the critical value of $\chi^2$ at the 5% level) we may regard the hypothesis of independence as acceptable. The degrees of freedom with which we look up the table is $(r-1)(s-1)$, as in any row only $r-1$ readings are independent; the sum being given, the $r$th can be obtained by subtraction.

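Note. The above table is called an $r\times s$ contingency table.

Pearson's coefficient of mean square contingency

$$C=\sqrt{\frac{\chi^2}{N+\chi^2}}.$$

$C$ can be reduced to a simple form as below.

$$\chi^2=\sum_i\sum_j\frac{(F_{ij}-E_{ij})^2}{E_{ij}}=\sum_i\sum_j\frac{F_{ij}^2-2E_{ij}F_{ij}+E_{ij}^2}{E_{ij}}=\sum_i\sum_j\frac{F_{ij}^2}{E_{ij}}-N,$$

since $\sum_i\sum_j E_{ij}=\sum_i\sum_j F_{ij}=N$. Therefore

$$\frac{\chi^2}{N}=\sum_i\sum_j\frac{F_{ij}^2}{N E_{ij}}-1=\sum_i\sum_j\frac{F_{ij}^2}{n_i m_j}-1,\quad\text{since }E_{ij}=\frac{n_i m_j}{N},$$

so that

$$1+\frac{\chi^2}{N}=\sum_i\sum_j\frac{F_{ij}^2}{n_i m_j}=S,\ \text{say,}\qquad\text{and}\qquad C=\sqrt{\frac{S-1}{S}}.$$

Even with perfect association $C$ is not equal to one, although it approaches one as the number of rows and columns increases.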

Ex. 1. Show that in a 2x2 contingency table wherein the frequencies are

a   b
c   d

$\chi^2$ calculated from independent frequencies is

$$\chi^2=\frac{(a+b+c+d)(ad-bc)^2}{(a+b)(c+d)(b+d)(a+c)}.$$

[B. A. Hons. Delhi '59, '61, '66; M. Sc. Agra '56, '59]

The 2x2 contingency table is

Y \ X     X_1    X_2    Total
Y_1       a      b      a+b
Y_2       c      d      c+d
Total     a+c    b+d    N = a+b+c+d

Let the corresponding expected cell frequencies be denoted by $a', b', c', d'$, where $N=a'+b'+c'+d'$. Hence

$$a'=\frac{(a+b)(a+c)}{a+b+c+d}\quad\text{and}\quad a-a'=\frac{ad-bc}{a+b+c+d},$$

$$b'=\frac{(a+b)(b+d)}{a+b+c+d}\quad\text{and}\quad b-b'=\frac{bc-ad}{a+b+c+d},$$

$$c'=\frac{(a+c)(c+d)}{a+b+c+d}\quad\text{and}\quad c-c'=\frac{bc-ad}{a+b+c+d},$$

$$d'=\frac{(b+d)(c+d)}{a+b+c+d}\quad\text{and}\quad d-d'=\frac{ad-bc}{a+b+c+d}.$$

Therefore,

$$\chi^2=\frac{(a-a')^2}{a'}+\frac{(b-b')^2}{b'}+\frac{(c-c')^2}{c'}+\frac{(d-d')^2}{d'}$$

$$=\frac{(ad-bc)^2}{a+b+c+d}\left[\frac{1}{(a+b)(a+c)}+\frac{1}{(a+b)(b+d)}+\frac{1}{(a+c)(c+d)}+\frac{1}{(b+d)(c+d)}\right]$$

$$=\frac{(ad-bc)^2}{a+b+c+d}\left[\frac{(b+d)(c+d)+(a+c)(c+d)+(a+b)(b+d)+(a+b)(a+c)}{(a+b)(a+c)(b+d)(c+d)}\right]$$

$$=\frac{(ad-bc)^2}{a+b+c+d}\left[\frac{(c+d)(a+b+c+d)+(a+b)(a+b+c+d)}{(a+b)(a+c)(b+d)(c+d)}\right]$$

$$=\frac{(ad-bc)^2\,(a+b+c+d)}{(a+b)(a+c)(b+d)(c+d)}.$$
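The identity just proved can be checked on a particular table with exact rational arithmetic. The following is a small sketch; the cell counts 10, 20, 30, 40 and the function names are purely illustrative.

```python
from fractions import Fraction

def chi2_general(table):
    """Sum of (F - E)^2 / E with E = (row total)(column total) / N, computed exactly."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    total = Fraction(0)
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            e = Fraction(rows[i] * cols[j], n)
            total += (f - e) ** 2 / e
    return total

def chi2_shortcut(a, b, c, d):
    """N (ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)] for a 2x2 table."""
    n = a + b + c + d
    return Fraction(n * (a * d - b * c) ** 2, (a + b) * (c + d) * (a + c) * (b + d))

a, b, c, d = 10, 20, 30, 40                      # hypothetical cell frequencies
general = chi2_general([[a, b], [c, d]])
shortcut = chi2_shortcut(a, b, c, d)
print(general, shortcut)
```

Both expressions give the same fraction, as the algebra requires.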

Ex. 2. Show that for a 2 x n table

$$\chi^2=N_1N_2\sum_r\frac{\left(\dfrac{\mu_{1r}}{N_1}-\dfrac{\mu_{2r}}{N_2}\right)^2}{\mu_{1r}+\mu_{2r}},$$

where $\mu_{1r}$, $\mu_{2r}$ are the 2 frequencies in the $r$th column and $N_1$, $N_2$ are the marginal sums of the 2 rows.

[M. Sc. Agra '53, B. A. Hons. Delhi '53]

Let the 2 x n table be as follows :

μ_11   μ_12   ...   μ_1r   ...   μ_1n   N_1
μ_21   μ_22   ...   μ_2r   ...   μ_2n   N_2

Let $\mu'_{1r}$ etc. denote the expected frequencies under the hypothesis of no association. Then

$$\mu'_{1r}=\frac{N_1(\mu_{1r}+\mu_{2r})}{N_1+N_2},\qquad \mu'_{2r}=\frac{N_2(\mu_{1r}+\mu_{2r})}{N_1+N_2},$$

$$\chi^2=\sum_r\left[\frac{(\mu_{1r}-\mu'_{1r})^2}{\mu'_{1r}}+\frac{(\mu_{2r}-\mu'_{2r})^2}{\mu'_{2r}}\right].$$

Now

$$\mu_{1r}-\mu'_{1r}=\mu_{1r}-\frac{N_1(\mu_{1r}+\mu_{2r})}{N_1+N_2}=\frac{N_2\mu_{1r}-N_1\mu_{2r}}{N_1+N_2}$$

and therefore

$$\frac{(\mu_{1r}-\mu'_{1r})^2}{\mu'_{1r}}=\frac{(\mu_{1r}N_2-\mu_{2r}N_1)^2}{N_1(N_1+N_2)(\mu_{1r}+\mu_{2r})}.$$

Similarly,

$$\frac{(\mu_{2r}-\mu'_{2r})^2}{\mu'_{2r}}=\frac{(\mu_{1r}N_2-\mu_{2r}N_1)^2}{N_2(N_1+N_2)(\mu_{1r}+\mu_{2r})}.$$

$$\therefore\ \chi^2=\sum_r\frac{(\mu_{1r}N_2-\mu_{2r}N_1)^2}{(N_1+N_2)(\mu_{1r}+\mu_{2r})}\left[\frac{1}{N_1}+\frac{1}{N_2}\right]
=\sum_r\frac{(\mu_{1r}N_2-\mu_{2r}N_1)^2}{(\mu_{1r}+\mu_{2r})}\cdot\frac{1}{N_1N_2}
=N_1N_2\sum_r\frac{\left(\dfrac{\mu_{1r}}{N_1}-\dfrac{\mu_{2r}}{N_2}\right)^2}{\mu_{1r}+\mu_{2r}}.$$

Now $P=P(\chi^2\ge\chi_0^2)$. The value of $P$ gives us the probability that on random sampling we should get a value of $\chi^2$ as great as, or greater than, the value actually obtained. Now, if $P$ is small, our data give us an improbable value of $\chi^2$.

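Ex. 3. Show that for entries in the 2 x r contingency table

          A_1   A_2   ...   A_i   ...   A_r   Totals
B_1       a_1   a_2   ...   a_i   ...   a_r   a
B_2       b_1   b_2   ...   b_i   ...   b_r   b
Totals    n_1   n_2   ...   n_i   ...   n_r   n

the value of $\chi^2$ for testing homogeneity is

$$\chi^2=\sum_i w_i(p_i-p)^2,$$

where $p_i=\dfrac{a_i}{n_i}$, $p=\dfrac{a}{n}$, $q=\dfrac{b}{n}$ and $w_i=\dfrac{n_i}{pq}$.

Deduce that

$$\chi^2=\frac{1}{pq}\left[\sum_i a_ip_i-ap\right].$$

[M. A. Patna '58; I. S. I. Cal. '53; I. C. A. R. '52]

Let the expected values of $a_i$ and $b_i$ be denoted by $a'_i$ and $b'_i$. Hence $a'_i=\dfrac{n_ia}{n}=n_ip$ and $b'_i=n_iq$, so that

$$\frac{(a_i-a'_i)^2}{a'_i}=\frac{n_i^2(p_i-p)^2}{n_ip}=\frac{n_i}{p}(p_i-p)^2.$$

Similarly, writing $q_i=b_i/n_i$,

$$\frac{(b_i-b'_i)^2}{b'_i}=\frac{n_i}{q}(q_i-q)^2.$$

$$\chi^2=\sum_i\left\{\frac{(a_i-a'_i)^2}{a'_i}+\frac{(b_i-b'_i)^2}{b'_i}\right\}
=\sum_i\left\{\frac{n_i}{p}(p_i-p)^2+\frac{n_i}{q}(q_i-q)^2\right\}$$

$$=\sum_i\left\{\frac{n_i}{p}(p_i-p)^2+\frac{n_i}{q}(p-p_i)^2\right\},\quad\text{since } q_i-q=p-p_i,$$

$$=\sum_i\frac{n_i}{pq}(p_i-p)^2,\quad\text{since } p+q=1,$$

$$=\sum_i w_i(p_i-p)^2,\quad\text{where } w_i=\frac{n_i}{pq}.$$

Further,

$$\chi^2=\sum_i w_i(p_i-p)^2=\frac{1}{pq}\sum_i\left(n_ip_i^2-2n_ip_ip+n_ip^2\right)
=\frac{1}{pq}\left[\sum_i a_ip_i-2p\sum_i n_ip_i+p^2\sum_i n_i\right]$$

$$=\frac{1}{pq}\left[\sum_i a_ip_i-2p\sum_i a_i+p^2n\right]
=\frac{1}{pq}\left[\sum_i a_ip_i-2pa+pa\right],\quad\text{since } np=a,$$

$$=\frac{1}{pq}\left[\sum_i a_ip_i-ap\right].\qquad\text{Proved.}$$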

Ex. 4. In a locality 100 persons were randomly selected and asked about their educational achievements. The results are given as under :

                       Education
               Middle   High School   College   Totals
Sex   Male       10         15           25        50
      Female     25         10           15        50
      Total      35         25           40       100

Can you say that education depends on sex ? You may use $\chi^2_{0.05}=5.99$, $\chi^2_{0.01}=9.21$ for $\nu=2$. [M. A. Delhi '54]

We assume that education does not depend on sex. On this assumption the corresponding expected frequencies are :

                       Education
               Middle             High School         College             Totals
Sex   Male     35x50/100 = 17.5   25x50/100 = 12.5    40x50/100 = 20        50
      Female   17.5               12.5                20                    50
      Total    35                 25                  40                   100

$$\chi^2=\frac{(10-17.5)^2}{17.5}+\frac{(15-12.5)^2}{12.5}+\frac{(25-20)^2}{20}+\frac{(25-17.5)^2}{17.5}+\frac{(10-12.5)^2}{12.5}+\frac{(15-20)^2}{20}=9.92\ \text{approx.}$$

$\nu$ = number of degrees of freedom = (2 - 1)(3 - 1) = 2.

Thus the calculated value of $\chi^2$ is highly significant, and therefore the hypothesis that sex and education are independent is discredited.
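The computation for an $r\times s$ table follows the same pattern whatever the size of the table, and may be sketched as below. The function name `contingency_chi2` is an illustrative choice; the data are those of the education example.

```python
def contingency_chi2(table):
    """Chi-square for independence: E_ij = (row total)(column total) / N."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            chi2 += (f - e) ** 2 / e
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# rows: male, female; columns: middle, high school, college
chi2, dof = contingency_chi2([[10, 15, 25], [25, 10, 15]])
print(round(chi2, 2), dof)
```

The value exceeds both the 5% and the 1% points for 2 d.f., in agreement with the conclusion reached above.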


Ex. 5. From the following data regarding the colour of eyes of fathers and sons, test if the colour of the son's eyes is associated with that of the father's.

                                  Eye colour of sons
                                  Not light   Light   Total
Eye colour of     Not light          230       148     378
fathers           Light              151       471     622
                  Total              381       619    1000

Given $\chi^2_{0.05}=3.84$ for $\nu=1$. [B. A. Hons. Delhi '65]



14.4. Formula of Brandt and Snedecor.

Suppose the 2 x n table is given as follows :

y \ x     x_1   x_2   ...   x_i   ...   Totals
y_1       a_1   a_2   ...   a_i   ...   A
y_2       b_1   b_2   ...   b_i   ...   B
Totals    c_1   c_2   ...   c_i   ...   N

Then we have to show that

$$\chi^2=\frac{1}{pq}\left[\sum_i\frac{a_i^2}{c_i}-\frac{A^2}{N}\right],\quad\text{where } p=\frac{A}{N}\text{ and } q=\frac{B}{N}.$$

We assume that the attributes $x$ and $y$ are independent. The expected frequencies corresponding to the observed frequencies $a_i$ and $b_i$ are

$$a'_i=\frac{c_iA}{N},\qquad b'_i=\frac{c_iB}{N}.$$

Hence using the formula $\chi^2=\sum\dfrac{o^2}{e}-N$ we get

$$\chi^2=\sum_i\frac{a_i^2}{c_iA/N}+\sum_i\frac{b_i^2}{c_iB/N}-N.$$

Now $b_i^2=(c_i-a_i)^2=c_i^2-2c_ia_i+a_i^2$, $\sum_i c_i=N$ and $\sum_i a_i=A$.

$$\therefore\ \chi^2=N\sum_i\frac{a_i^2}{c_i}\left(\frac1A+\frac1B\right)+\frac{N}{B}\sum_i c_i-\frac{2N}{B}\sum_i a_i-N$$

$$=\frac{N^2}{AB}\sum_i\frac{a_i^2}{c_i}+\frac{N^2}{B}-\frac{2NA}{B}-N$$

$$=\frac{N^2}{AB}\sum_i\frac{a_i^2}{c_i}+\frac{N(N-B)}{B}-\frac{2NA}{B}$$

$$=\frac{N^2}{AB}\sum_i\frac{a_i^2}{c_i}-\frac{NA}{B}$$

$$=\frac{N^2}{AB}\left[\sum_i\frac{a_i^2}{c_i}-\frac{A^2}{N}\right]
=\frac{1}{pq}\left[\sum_i a_ip_i-Np^2\right],\quad\text{if }\frac{a_i}{c_i}=p_i,$$

or

$$\chi^2=\frac{1}{pq}\left[\sum_i a_ip_i-Ap\right].$$

Ex. Prove that for a 2 x r contingency table

a_1   a_2   ...   a_r   m
b_1   b_2   ...   b_r   n

where $m$ and $n$ are row totals, the value of $\chi^2$ is

$$\left[\sum_i(a_ip_i)-mP\right]\Big/PQ,$$

where $P=\dfrac{m}{m+n}$, $p_i=\dfrac{a_i}{a_i+b_i}$ and $P+Q=1$.

[B. A. Hons. Delhi '50]
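As a numerical check on the formula of Brandt and Snedecor, the sketch below applies it to the 2 x 3 education table of Ex. 4 (males 10, 15, 25; females 25, 10, 15). The function name `brandt_snedecor` is only illustrative.

```python
def brandt_snedecor(a, b):
    """chi2 = (1 / pq) [ sum(a_i^2 / c_i) - A^2 / N ] for a 2 x n table with rows a and b."""
    c = [ai + bi for ai, bi in zip(a, b)]      # column totals c_i
    A, B = sum(a), sum(b)                      # row totals
    N = A + B
    p, q = A / N, B / N
    return (sum(ai * ai / ci for ai, ci in zip(a, c)) - A * A / N) / (p * q)

chi2 = brandt_snedecor([10, 15, 25], [25, 10, 15])
print(round(chi2, 2))
```

The result agrees with the value of about 9.92 obtained from the expected frequencies directly, while needing only the first row and the column totals.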

14.5. Yates's Correction for continuity in 2x2 Contingency table.

Let us consider the table

a   b
c   d

The $\chi^2$ for this table has been calculated as

$$\chi^2=\frac{N(ad-bc)^2}{(a+b)(c+d)(a+c)(b+d)},\quad\text{where } N=a+b+c+d.\qquad\ldots(1)$$

Yates has suggested that the approximation to $\chi^2$ is improved by replacing one cell frequency, say $d$, by $d\pm\frac12$ according as $ad<bc$ or $ad>bc$, and adjusting the others to keep the marginal totals unaltered. The necessity for Yates's correction arises since the distribution of frequencies in a contingency table is necessarily discontinuous, whereas the $\chi^2$ distribution is continuous. The approximation is similar to the approximation of the discontinuous binomial distribution to a normal one, where instead of

$$\frac{1}{\sqrt{2\pi}}\int_a^b e^{-\frac12x^2}\,dx$$

we actually take

$$\frac{1}{\sqrt{2\pi}}\int_{a-1/2}^{b+1/2}e^{-\frac12x^2}\,dx.$$

The effect of introducing Yates's correction is that $(ad-bc)$ in (1) above is replaced by $|ad-bc|-N/2$. This is known as Yates's correction for continuity. It undoubtedly improves the estimate of significance for 2x2 tables. It should always be applied unless the cell frequencies are 500 or more.

Ex. Prove that $\chi^2$ for the table

a+½   b-½
c-½   d+½

is given by

$$\chi^2=\frac{N\left(|ad-bc|-N/2\right)^2}{(a+b)(c+d)(a+c)(b+d)},\quad\text{if } ad<bc.$$
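The effect of the correction is easily seen numerically. The following sketch computes $\chi^2$ for a 2x2 table with and without the correction; the counts 3, 5, 10, 2 are illustrative, and the clamping of $|ad-bc|-N/2$ at zero is a common practical safeguard that is an assumption of this sketch, not part of the formula above.

```python
def chi2_2x2(a, b, c, d, yates=False):
    """N (ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)], optionally with Yates's correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # replace |ad - bc| by |ad - bc| - N/2 (never allowed to go below zero here)
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

plain = chi2_2x2(3, 5, 10, 2)
corrected = chi2_2x2(3, 5, 10, 2, yates=True)
print(round(plain, 2), round(corrected, 2))
```

For these counts the uncorrected value lies above the 5% point 3.84 for 1 d.f., while the corrected value lies below it.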

14.6. Fisher's Exact test for 2x2 tables.

We consider the mathematical model in which there are $N$ balls in an urn, of which $r_1$ are marked $A_1$ and $r_2$ are marked $A_2$. The balls are withdrawn in random order and put into a row of $N$ boxes, one ball to a box, of which $c_1$ are marked $B_1$ and $c_2$ are marked $B_2$. We have the following 2x2 table.

          B_1    B_2    Totals
A_1       a      b      r_1
A_2       c      d      r_2
Totals    c_1    c_2    N

The probability of $a$ $A_1$'s and $c$ $A_2$'s in the $c_1$ boxes marked $B_1$ is given by the hypergeometric law, since this is a problem of sampling without replacement. It is

$$\binom{r_1}{a}\binom{r_2}{c}\Big/\binom{N}{c_1}=\frac{r_1!\;r_2!\;c_1!\;c_2!}{a!\;b!\;c!\;d!\;N!}.$$

If the distribution in the boxes marked $B_1$ is fixed, that in the boxes marked $B_2$ is also determined, so that the probability of the observed 2x2 table, in which we suppose that $d$ is the smallest frequency, is

$$P'_d=\frac{r_1!\;r_2!\;c_1!\;c_2!}{a!\;b!\;c!\;d!\;N!}.\qquad\ldots(1)$$

The total probability of the observed distribution, together with those more extreme in the same direction, is given by

$$P=P'_0+P'_1+P'_2+\cdots+P'_d.$$
This is known as Fisher's 'Exact Test'. It consists of computing $P'_d$ from (1) for all values of $d$ from 0 up to the observed value (if $d<8$). The probability $P$ corresponds to one tail of the distribution, and thus is comparable with half the probability calculated from $\chi^2$, since the latter corresponds to both tails of the distribution. If $d>3$, the total probability in one side is given by

$$P=P'_d+P'_{d+1}+\cdots$$

The chief objection to Fisher's exact test is the large amount of computation involved when the cell frequencies are large.

Ex. In an experiment on immunization of cattle from tuberculosis the following results are obtained :

                    Affected   Unaffected
Inoculated              3           5
Not inoculated         10           2

Examine the effect of vaccine in controlling susceptibility to tuberculosis. Does Yates's correction improve the estimate of significance ? Judge by considering the Fisher exact test for 2x2 table.

From the above table :

$$\chi^2=\frac{(ad-bc)^2N}{(a+b)(c+d)(a+c)(b+d)}$$

yields

$$\chi^2=\frac{44^2\times20}{8\times12\times13\times7}=4.43,$$

for which $P$ is about 3%. This is significant, and the effect of vaccine in controlling susceptibility to tuberculosis is not discredited.

Applying Yates's correction,

$$\chi^2=\frac{34^2\times20}{8\times12\times13\times7}=2.65,$$

for which $P$ is about 10%. This is not significant, and the hypothesis that the vaccine does not control tuberculosis is acceptable. Thus we observe that the correction changes a significant probability to a non-significant one, on the 5% level.

The probability of the observed distribution is

$$\frac{8!\;12!\;13!\;7!}{3!\;5!\;10!\;2!\;20!}=0.0477.$$

The probability when $d=1$ is

$$\frac{8!\;12!\;13!\;7!}{2!\;6!\;11!\;1!\;20!}=0.0043.$$

The probability when $d=0$ is

$$\frac{8!\;12!\;13!\;7!}{1!\;7!\;12!\;0!\;20!}=0.0001.$$

The total probability of the observed distribution, or of one more extreme in the same direction, is therefore $P=0.0477+0.0043+0.0001=0.0521$.
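The three probabilities of this example can be reproduced with the hypergeometric formula (1). The sketch below takes the table $a=3$, $b=5$, $c=10$, $d=2$ (row totals 8 and 12, column totals 13 and 7, $N=20$); the function names are illustrative only, and the loop assumes, as in the text, that $d$ is the smallest frequency.

```python
from math import comb

def table_probability(a, b, c, d):
    """Hypergeometric probability of a 2x2 table with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def fisher_one_tail(a, b, c, d):
    """Sum the probabilities of the observed table and of those with smaller d."""
    total = 0.0
    while True:
        total += table_probability(a, b, c, d)
        if d == 0 or a == 0:
            break
        # lower d by one, keeping every marginal total unchanged
        a, b, c, d = a - 1, b + 1, c + 1, d - 1
    return total

p_observed = table_probability(3, 5, 10, 2)
p_tail = fisher_one_tail(3, 5, 10, 2)
print(round(p_observed, 4), round(p_tail, 4))
```

The one-tail total of about 0.052 is to be compared with half the probability obtained from $\chi^2$.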

14.7. Some other tests based on $\chi^2$.

(a) Test for variance of a normal population.

We have to test whether the sample $x_1, x_2, \ldots, x_n$ has been drawn from a normal population with variance $\sigma_0^2$.

The statistic $\sum_i\dfrac{(x_i-\bar x)^2}{\sigma_0^2}$ is distributed like $\chi^2$ with $n-1$ D.F. If the calculated value of $\chi^2$ is less than $\chi^2_{0.05}$ for $n-1$ d.f., the data may be considered consistent with the hypothesis.

If we know that $P(a<\chi^2<b)=\alpha$, the confidence interval of the variance of a normal population can be found from the sample as below.

$$P(a<\chi^2<b)=\int_a^b f(\chi^2)\,d\chi^2=\alpha$$

$$\Rightarrow\ P\left(a<\frac{\sum(x_i-\bar x)^2}{\sigma^2}<b\right)=\alpha
\ \Rightarrow\ P\left(\frac{\sum(x_i-\bar x)^2}{b}<\sigma^2<\frac{\sum(x_i-\bar x)^2}{a}\right)=\alpha.$$

The length of the confidence interval of $\sigma^2$ is

$$\sum(x_i-\bar x)^2\left(\frac1a-\frac1b\right).$$


(b) Combination of Probabilities from 2x2 tables.

When a number of independent tests of significance have been made, it is sometimes desired to estimate the overall significance of their aggregate. Fisher has given a method of combining probabilities in such cases.

Let $dP=f(x)\,dx$, so that

$$P=P(x<x_1)=\int_{-\infty}^{x_1}f(x)\,dx;$$

the range of $P$ is (0, 1). If $P$ is a random variable depending on the variable $x_1$ with frequency function $g(P)$, we have

$$g(P)\,dP=f(x_1)\,dx_1=dP$$

$\Rightarrow$ $g(P)=1$ $\Rightarrow$ $P$ has a rectangular distribution in the range 0 to 1.

If $u=-2\log_e P$, then $P=e^{-u/2}$ and

$$|dP|=\tfrac12 e^{-u/2}\,du,\qquad u>0$$

$\Rightarrow$ frequency function of $u$, say $h(u)=\tfrac12 e^{-u/2}$

$\Rightarrow$ $u$ has $\chi^2$ distribution with 2 d.f.

If now we combine $k$ independent probabilities, the combined probability is the product of the $k$ separate probabilities, or

$$u=-2\log_e P=-2\log_e(P_1P_2\cdots P_k)=-2\sum_{i=1}^{k}\log_e P_i=\sum_{i=1}^{k}u_i,$$

where $u_i$ has $\chi^2$ distribution with 2 d.f. for every $i$.

Hence by the additive property of $\chi^2$ distribution, $u$ has $\chi^2$ distribution with $2k$ d.f.

Let us note for calculation purposes that

$$-2\log_e(P_1P_2\cdots P_k)=\log_e\left(\frac{1}{P_1P_2\cdots P_k}\right)^2=2.3026\,\log_{10}\left(\frac{1}{P_1P_2\cdots P_k}\right)^2.$$

In a 2x2 table the distribution of $P$ is not really rectangular, but for moderately large frequencies the approximation is satisfactory.
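The method may be sketched as follows, using the four probabilities of Ex. 1 below. The closed form used for the tail probability, $P(\chi^2_{2k}>u)=e^{-u/2}\sum_{j<k}(u/2)^j/j!$, is a standard identity for an even number of degrees of freedom; it is brought in here as an assumption of the sketch and is not derived in the text, although for $k=1$ it reduces to the result of Ex. 2.

```python
import math

def combine_probabilities(ps):
    """Return (u, dof, tail) where u = -2 * sum(log P_i) is referred to chi-square with 2k d.f."""
    u = -2.0 * sum(math.log(p) for p in ps)
    k = len(ps)
    half = u / 2.0
    # tail probability for 2k d.f.: exp(-u/2) * sum_{j < k} (u/2)^j / j!
    tail = math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))
    return u, 2 * k, tail

u, dof, tail = combine_probabilities([0.145, 0.263, 0.087, 0.075])
print(round(u, 2), dof, round(tail, 3))
```

The value of $u$ is then compared with the tabulated 5% point for $2k$ d.f. (15.51 for 8 d.f.).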

Ex. 1. Four tests of significance have yielded probabilities .145, .263, .087 and .075. Test whether the aggregate of these four tests should be regarded as significant. [M. A. Bom. '52]

Ex. 2. Show that for 2 d.f. the probability $P$ of a value of $\chi^2$ greater than $\chi_0^2$ is $e^{-\frac12\chi_0^2}$ and hence that $\chi_0^2=2\log_e\dfrac1P$. Deduce the value of $\chi_0^2$ when $P=0.05$.

[B. Sc. (Raj.) '54, B. Sc. Bom. '50, B. A. Hons. Delhi '56, '59]

[Hint.

$$P=P(\chi^2>\chi_0^2)=\tfrac12\int_{\chi_0^2}^{\infty}e^{-\frac12\chi^2}\,d\chi^2=e^{-\frac12\chi_0^2}
\ \Rightarrow\ \chi_0^2=2\log_e\frac1P.$$

When $P=0.05$, $\chi_0^2=2\log_e 20=5.99$.]

(c) Testing homogeneity of several estimates of the population variance

Bartlett Test : Suppose in $k$ independent samples the $i$th sample gives an estimate $s_i^2$ of the population variance with d.f. $\nu_i$. The question is : Are these estimates consistent with the hypothesis that the samples have been drawn from the same population ? In other words, are the estimates homogeneous ?

We assume that the samples have been drawn from the same population. An unbiased estimate $s^2$ of the variance of the population is

$$s^2=\frac1\nu\sum_i\nu_is_i^2\qquad\left(\nu=\sum_i\nu_i\right),\quad\text{where obviously } E(s^2)=\sigma^2.$$

Bartlett has shown that the statistic

$$\frac1c\left(\nu\log_e s^2-\sum_i\nu_i\log_e s_i^2\right),\quad\text{where } c=1+\frac{1}{3(k-1)}\left(\sum_i\frac{1}{\nu_i}-\frac1\nu\right),$$

has $\chi^2$ distribution with $k-1$ d.f.

The value of $\chi^2$ calculated from the data will tell whether the hypothesis of homogeneity is reasonable.

If we have judged by the F test that the smallest and largest variance in a set do not differ significantly, then it is useless to employ the Bartlett test.
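The statistic is straightforward to compute once the variance estimates and their degrees of freedom are given. The sketch below follows the formula stated above; the function name and the numerical variances are hypothetical and serve only to show the behaviour of the statistic.

```python
import math

def bartlett_statistic(variances, dfs):
    """Return (statistic, dof) for k variance estimates s_i^2 with d.f. nu_i."""
    k = len(variances)
    nu = sum(dfs)
    pooled = sum(v * s2 for v, s2 in zip(dfs, variances)) / nu       # s^2
    c = 1.0 + (sum(1.0 / v for v in dfs) - 1.0 / nu) / (3.0 * (k - 1))
    log_terms = sum(v * math.log(s2) for v, s2 in zip(dfs, variances))
    return (nu * math.log(pooled) - log_terms) / c, k - 1

equal, dof = bartlett_statistic([4.0, 4.0, 4.0], [9, 9, 9])      # identical estimates
unequal, _ = bartlett_statistic([2.0, 4.0, 9.0], [9, 9, 9])      # hypothetical unequal estimates
print(round(equal, 6), round(unequal, 2), dof)
```

When all the estimates coincide the statistic vanishes; the more they differ, the larger it becomes, and it is referred to the $\chi^2$ table with $k-1$ d.f.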

Note. If $N(\mu, \sigma^2)$ denotes the usual normal population, the following tests of significance are available to test the various hypotheses mentioned :

(i) If $\mu=\mu_0$ and $\sigma$ is supposed to be known.

The statistic $\dfrac{(\bar x-\mu_0)\sqrt n}{\sigma}$, which is a $N(0, 1)$ variate, will serve as a test of significance.

(ii) If $\mu=\mu_0$ and $\sigma$ is supposed to be unknown.

The statistic $\dfrac{(\bar x-\mu_0)\sqrt n}{s}$, where $s^2=\dfrac{1}{n-1}\sum(x_i-\bar x)^2$, has $t$ distribution with $(n-1)$ d.f. and will serve as a test of significance.

(iii) If $\sigma=\sigma_0$ and $\mu$ is known.

The statistic $\sum_{i=1}^{n}\dfrac{(x_i-\mu)^2}{\sigma_0^2}$ has $\chi^2$ distribution with $n$ d.f. and will serve as a test of significance.

(iv) If $\sigma=\sigma_0$ and $\mu$ is unknown.

The statistic $\sum_{i=1}^{n}\dfrac{(x_i-\bar x)^2}{\sigma_0^2}$ has $\chi^2$ distribution with $(n-1)$ d.f. and will serve as a test of significance.

(v) If $\mu_1=\mu_2$ in two normal populations $N(\mu_1, \sigma_1^2)$, $N(\mu_2, \sigma_2^2)$, where $\mu_1$ and $\mu_2$ are unknown but $\sigma_1^2$ and $\sigma_2^2$ are known.

The statistic $\dfrac{\bar x-\bar y}{\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}}$, where $n_1$ and $n_2$ are the sizes of the two samples, is a standard normal variate and will serve as a test of significance.


Exercises

1. Five dice were thrown 192 times and the number of times 4, 5 or 6 were thrown were as follows :

No. of dice throwing 4, 5, 6 :   5   4   3   2   1   0
Observed frequency           :   6  46  70  48  20   2

Calculate $\chi^2$. [B. Sc. Agra '45]

[Ans. $\chi^2$ = 16.61]

2. Five dice were thrown 96 times and the numbers 4, 5 or 6 were thrown as given below :

No. of dice throwing 4, 5 or 6 :   5   4   3   2             1   0
f                              :   7  19  35   [illegible]   8   3

Calculate $\chi^2$. [M. Sc. Agra '46]

[Ans. $\chi^2$ = 8.3]
3. Find the value of $\chi^2$ for the following data :

Class    A    B    C    D    E
f_o      8   29   44   15    4
f_e      7   24   38   24    7

[M. Sc. Agra '51, '55]

[Ans. 6.75]

4. 12 dice were thrown 26,306 times, observing at each throw the number of dice recording a 5 or a 6. Obtain $\chi^2$.

No. of successes      0     1     2     3     4     5     6     7    8    9   10 and over
Observed frequency   185  1149  3265  5475  6114  5194  3067  1331  403  105      18

[Ans. 35.941] [M. Sc. Agra '44, '59, U. P. C. S. '54]

5. The following table gives the number of aircraft accidents that occurred during the various days of the week. Find whether the accidents are uniformly distributed over the week.

Days               Mon.  Tue.  Wed.  Thur.  Fri.  Sat.  Total
No. of accidents    14    16    12    11     9    14     84

[I. C. A. R. Delhi '51]

6. A survey of 320 families with 5 children each revealed the following distribution :

No. of boys     :   5   4    3   2   1   0
No. of girls    :   0   1    2   3   4   5
No. of families :  14  56  110  88  40  12

Is this result consistent with the hypothesis that male and female births are equally probable ? [B. Sc. Mysore '67]

[$\chi^2_{0.05}$ = 11.07 for 5 d.f.]

7. The following is the distribution of 100 eight pig-litters according to the number of males in the litter :

No. of males :     1    2    3    4    5    6    7    8   Total

No. of litters :   5    8   22   23   25   12    1    4    100

Fit a binomial distribution under the hypothesis that the sex ratio is 1 : 1. Test the goodness of fit. Given that for 4 d. f. $\chi^2$ at 5% level of significance = 9.488. [B. Sc. Agra '57]

8. In experiments on pea breeding, Mendel obtained the following frequencies of seeds : 315 round and yellow, 101 wrinkled and yellow, 108 round and green, 32 wrinkled and green. Total 556. Theory predicts that the frequencies should be in the proportion 9 : 3 : 3 : 1. Examine the correspondence between theory and experiment. [B. Sc. Agra '63; B. Sc. Delhi '67; B. Sc. Kanpur '69]

9. To test a hypothesis $H_0$ an experiment is performed 3 times. The resulting values of $\chi^2$ are 2.37, 1.86 and 3.54, each of which corresponds to one d. f. Show that while $H_0$ cannot be rejected at 5% level on the basis of any individual experiment, it can be rejected when the three experiments are collectively counted. [B. Sc. Poona '68]

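10. Prove that for an $r \times s$ contingency table $\chi^2 \le N(r-1)$ or $\chi^2 \le N(s-1)$, whichever is less. [B. Sc. Raj. '51]

[Hint. With $E_{ij} = n_i m_j / N$,

$$\chi^2 = \sum_i \sum_j \frac{F_{ij}^2}{E_{ij}} - N = N\Big[\sum_i \sum_j \frac{F_{ij}^2}{n_i m_j} - 1\Big], \quad i = 1, 2, \dots, r;\ j = 1, 2, \dots, s.$$

Now $F_{ij} \le n_i$ and $F_{ij} \le m_j$.

Hence

$$\chi^2 \le N\Big[\sum_i \sum_j \frac{F_{ij}}{n_i} - 1\Big] = N\Big[\sum_i 1 - 1\Big] = N(r-1).$$

Similarly, $\chi^2 \le N(s-1)$. Hence the required result.]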

11. In an experiment on immunization of cattle from tuber¬ 
culosis the following results are obtained : 

Affected Unaffected 

Inoculated 12 26 

Not inoculated 16 6 

Examine the effect of vaccine in controlling susceptibility to 
tuberculosis. [I. A. S. '48]

12. Given a $2 \times r$ contingency table representing two independent samples

                                                              Total

            $\mu_1$          $\mu_2$        ...     $\mu_r$            $m$

            $\nu_1$          $\nu_2$        ...     $\nu_r$            $n$

Total    $\mu_1+\nu_1$    $\mu_2+\nu_2$     ...   $\mu_r+\nu_r$      $m+n$

Show that

$$\chi^2 = \frac{1}{\omega(1-\omega)} \Big[\sum_{i=1}^{r} \mu_i \omega_i - m\omega\Big],$$

where $\omega_i = \dfrac{\mu_i}{\mu_i + \nu_i}$ and $\omega = \dfrac{m}{m+n}$,

can be used to test whether the samples are drawn from the same population. Clearly state the underlying assumptions, and give the number of d. f. [B. Sc. Lucknow '64]

[Hint. See § 12.4. This is the Brandt and Snedecor formula.]
13. Genetic theory states that children having one parent of blood type M and the other blood type N will always be one of the three types M, MN, N and that the proportions of the three types will on average be 1 : 2 : 1. A report states that out of 300 children having one M parent and one N parent, 30% were found to be type M, 45% type MN and the remainder type N. Test the hypothesis by the $\chi^2$ test. [U. P. P. C. S. '66]

[$\chi^2_{0.05} = 5.991$ for $\nu = 2$]

14. The normal rate of infection for a certain disease in 
cattle is known to be 50%. In an experiment with seven animals 
injected with a new vaccine, it was found that none of the animals 
caught infection. Can the evidence be regarded as conclusive at 
1% level of significance to prove the value of the new vaccine ? 

[I. A. S. ’49] 

15. Four coins were tossed at a time and this operation was repeated 160 times. It is found that 4 heads occur 6 times, 3 heads occur 43 times, 2 heads occur 69 times, one head occurs 34 times. Discuss whether the coins may be regarded as unbiased.

[I. S. I. Cal. '52] 

16. The following data show suicides of women in 8 German states during 14 years :

No. of suicides in a state per year :    0    1    2    3    4    5    6    7   Total

Observed frequency :                   364  376  218   89   33   13    2    1   1096

Fit a Poisson distribution to the data and show that the fit is not satisfactory.

17. An operations research group studying the libraries of a certain university found that the number ($x$) of books withdrawn by individual users during any one visit to the science library was distributed with probability function

$$f(x) = (0.4)(0.6)^x, \quad x = 0, 1, 2, \dots$$

Test this hypothesis for goodness of fit using the following data, which represent an independent random sample of 100 observations.

Value of $x$ :    0    1    2    3    4 or more

Frequency :      37   25   17    5    16

[M. A. Delhi]

18. 1582 drivers were asked how many accidents they had 
caused. 951 had caused no accident, 501 had caused one accident 
each, 100 had caused two accidents each and 30 had caused three 





accidents each. Which distribution is appropriate for these data 
and why ? Fit the distribution and test the goodness of fit. 

(Given \/e=l££i). [B. Sc. Agra ’68] 

19. Test for independence between health and working capacity from the following data :

                                 Health

                         Very good    Good    Poor

Working capacity  Good       20        25      15

                  Bad        10        15      15

[B. Sc. Bom. '67]


20. The wages of 1,000 employees range from 4s. 6d. to 19s. 6d. They are grouped in 15 classes with a common class interval of 1s., and the class frequencies, from the lowest class to the highest, are

6, 17, 35, 48, 65, 90, 131, 173, 155, 117, 75, 52, 21, 9, 6.

Tabulate the data, and show that the mean wage is 12.006s., and S. D. 2.6265s.

Can the wages of 1,000 employees be regarded as a random 
sample from a normal population ? 

[Hint. The class frequencies per thousand of the normal distribution are approximately

6.7, 11.3, 25.0, 48.0, 79.0, 113.1, 140.5, 151.0, 140.8, 113.7, 79.5, 48.5, 25.3, 11.0 and 6.7.

Our hypothesis is that the population is normal. Since the sample is large, its mean and variance are taken as estimates of those of the population. On account of the smallness of the extreme frequencies we combine the first two classes, and also the last two, leaving 13 classes. Since the sum of the class frequencies is constant, and two of the parameters are estimated from the sample, the number of d. f. is 10. The value of $\chi^2$ from the sample is

$$\chi^2 = \frac{5^2}{18} + \frac{10^2}{25} + 0 + \frac{14^2}{79} + \dots + \frac{(3.2)^2}{18} = 19.84, \quad \nu = 10.$$

From the table, $\chi^2_{0.05} = 18.31$ for $\nu = 10$. Hence the calculated value of $\chi^2$ is significant at the 5% level. We conclude that the assumption of a normal population is probably incorrect.]





21. In 1,000 extensive sets of trials for an event of small probability, the frequencies $f$ of the numbers $x$ of successes proved to be

$x$ :    0    1    2    3    4    5    6    7

$f$ :  305  365  210   80   28    9    2    1

Show that the mean number of successes is 1.2. May the data be regarded as those of a random sample from a Poissonian distribution ?

[Hint. $m = 1.2$. The frequencies of the Poissonian distribution, with the same mean and the same total frequency, are approximately

301.2, 361.4, 216.8, 86.7, 26.0, 6.2, 1.2, 0.2.

To apply the $\chi^2$ test we combine the last three classes. Then

$$\chi^2 = \frac{(3.8)^2}{301.2} + \frac{(3.6)^2}{361.4} + \dots + \frac{(4.4)^2}{7.6} = 3.5 .$$

Here the number of classes is 6; but the total frequency is constant, and $m$ was estimated from the sample, so that $\nu = 4$.

From the table, $P(\chi^2 > 3.5)$ for $\nu = 4$ is 0.50.

The value is therefore not at all significant, and the assumption of a Poissonian population is not discredited.]
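The arithmetic of this hint can be retraced with a short Python sketch. It assumes, as the hint does, that the mean is rounded to 1.2 and that the last three classes are pooled; the variable names are illustrative only. Because the hint rounds the expected frequencies before summing, the value computed here is close to, but not identical with, the 3.5 quoted above.

```python
import math

# Observed frequencies for x = 0, 1, ..., 7 successes (Exercise 21).
observed = [305, 365, 210, 80, 28, 9, 2, 1]
total = sum(observed)

# Sample mean number of successes, rounded to one decimal as in the hint.
sample_mean = sum(x * f for x, f in enumerate(observed)) / total
m = round(sample_mean, 1)

# Poisson expected frequencies with the same mean and total frequency.
expected = [total * math.exp(-m) * m ** x / math.factorial(x)
            for x in range(len(observed))]

# Pool the last three classes (x = 5, 6, 7) because they are small.
obs_pooled = observed[:5] + [sum(observed[5:])]
exp_pooled = expected[:5] + [sum(expected[5:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))

# Six classes, minus one for the fixed total, minus one for the estimated mean.
dof = len(obs_pooled) - 1 - 1

print(f"mean = {sample_mean:.3f}, chi-square = {chi2:.2f}, d.f. = {dof}")
```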

22. From the adult male populations of seven large cities, random samples of the sizes indicated below were taken, and the numbers of married and single men recorded. Do the data indicate any significant variation among the cities in the tendency of men to marry ?

City :       A    B    C    D    E    F    G   Total

Married :  133  164  155  106  153  123  146    980

Single :    36   57   40   37   55   33   36    294

Total :    169  221  195  143  208  156  182  1,274

[Hint. We test the hypothesis that there is no significant variation in the tendency mentioned. The ratio between the entries in the column of totals is $980/294 = 10/3$, and this ratio should hold in all cities amongst married men and single. The theoretical frequencies are :

City :       A    B    C    D    E    F    G   Total

Married :  130  170  150  110  160  120  140    980

Single :    39   51   45   33   48   36   42    294

Total :    169  221  195  143  208  156  182  1,274

From these figures

$$\chi^2 = \frac{(133-130)^2}{130} + \frac{(164-170)^2}{170} + \dots + \frac{(33-36)^2}{36} + \frac{(36-42)^2}{42} = 5.34 .$$

To find the number of degrees of freedom we observe that the sum of the frequencies of married and single men from any city is constant, being equal to the size of the sample from that city. This reduces the number of independent frequencies to 7. And further, a parameter of the population was estimated from the sample, namely, the ratio of the numbers of married and single men. Consequently $\nu = 7 - 1 = 6$. For $\nu = 6$, $P(\chi^2 > 5.34) = 0.50$. The value is therefore not significant, and the test furnishes no evidence against the hypothesis.]
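The computation in this hint can be checked with the following Python sketch. It assumes the theoretical frequencies are obtained by applying the pooled proportions of married and single men to each city total, which is what the common ratio in the hint amounts to; the names used are illustrative.

```python
# Observed counts for cities A to G (Exercise 22).
married = [133, 164, 155, 106, 153, 123, 146]
single = [36, 57, 40, 37, 55, 33, 36]

city_totals = [m + s for m, s in zip(married, single)]
grand_total = sum(city_totals)

# Pooled proportions of married and single men over all cities.
p_married = sum(married) / grand_total
p_single = sum(single) / grand_total

# Theoretical frequencies under the hypothesis of no variation among cities.
exp_married = [t * p_married for t in city_totals]
exp_single = [t * p_single for t in city_totals]

chi2 = sum((o - e) ** 2 / e for o, e in zip(married, exp_married))
chi2 += sum((o - e) ** 2 / e for o, e in zip(single, exp_single))

# (rows - 1) * (columns - 1) degrees of freedom for a 2 x 7 table.
dof = (2 - 1) * (len(city_totals) - 1)

print(f"chi-square = {chi2:.2f} on {dof} d.f.")
```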



15 

ESTIMATION 

15.1. Introduction. The theory of estimation was founded by Prof. R. A. Fisher in a series of fundamental papers. In the following sections we wish to give an account of the main ideas introduced by Fisher, completing his results on certain points.

Now we assume that we are given a sample of observations $x_1, x_2, \dots, x_n$ from a population whose distribution has a known mathematical form but involves a certain number of unknown parameters. There always exists an infinite number of functions of the sample values that might be proposed as estimates of the parameters. Now we have the following questions in our mind :

(1) How should we best use the data to form estimates ? and
(2) What do we mean by 'best' estimates ?

The best estimate is the estimate falling nearest to the true value of the parameter to be estimated. It is important to bear in mind that each estimate is a function of the sample values and is therefore to be considered as an observed value of a certain random variable. Note that the observations are random variables and any function of the observations will also be a random variable. A function of the observations alone is called a statistic. Hence we have no means to predict the individual value assumed by the estimate in a given particular case. This implies that the goodness of an estimate cannot be judged from individual values but only from the distribution of the values which it will assume in the long run, i.e. from its sampling distribution. There is a great probability that the estimate will differ from the true value only by a small quantity, provided that the great bulk of the mass in this distribution is concentrated in some small neighbourhood of the true value. Hence an estimate will be better in the same degree as its sampling distribution shows a greater concentration about the true value. The question arises : How can we use our data to obtain estimates of maximum concentration ? This question is the starting point of our investigation.

The estimating functions are referred to as estimators.

Characteristics of Estimates. According to R. A. Fisher, the following are some of the criteria which should be satisfied by a good estimator.

(a) Consistency.

(b) Unbiasedness.

(c) Efficiency.

An estimator is said to be the best if it is (i) consistent, (ii) efficient, (iii) sufficient. These are Fisher's criteria for the best estimator.

15.2. Consistency.

Definition. An estimator $t_n$ computed from a random sample of $n$ values, i.e. $t_n = t_n(x_1, x_2, \dots, x_n)$, is a consistent estimator of $\theta$ if, for arbitrarily small positive numbers $\epsilon$ and $\eta$, there exists a positive integer $n_0(\epsilon, \eta)$ such that $n > n_0$ implies that the probability that

$$|t_n - \theta| < \epsilon$$

is greater than $1 - \eta$. In the notation of the theory of probability

$$n > n_0 \Rightarrow P\{|t_n - \theta| < \epsilon\} > 1 - \eta,$$

where $n_0$ is some very large value of $n$.

In a somewhat less rigorous sense, the reformulation of the result is as follows :

$$n \to \infty \Rightarrow P\{|t_n - \theta| < \epsilon\} \to 1 .$$

The definition bears an obvious analogy to the definition of convergence in the mathematical sense. Given any fixed small quantity $\epsilon$, it is always possible to determine a large sample number such that for all samples over that size the probability that $t_n$ differs from the true value by more than $\epsilon$ is as near zero as we please. Then $t_n$ is said to converge in probability, or to converge stochastically, to $\theta$. Hence $t_n$ is a consistent estimator of $\theta$ if it converges to $\theta$ in probability.
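The definition may be illustrated by simulation. The sketch below is only an illustration under assumed values (a normal population with mean 2 and unit standard deviation, a tolerance of 0.2, and a fixed random seed); it estimates the probability that the sample mean lies within the tolerance of the true value for increasing sample sizes.

```python
import random

def fraction_within(n, eps, theta=2.0, sigma=1.0, reps=2000, seed=0):
    """Estimate P{|sample mean - theta| < eps} for samples of size n.

    theta, sigma, reps and seed are illustrative choices, not values
    taken from the text.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        mean = sum(rng.gauss(theta, sigma) for _ in range(n)) / n
        if abs(mean - theta) < eps:
            hits += 1
    return hits / reps

# The estimated probability should rise towards 1 as n grows.
results = {n: fraction_within(n, eps=0.2) for n in (5, 50, 500)}
for n, frac in results.items():
    print(f"n = {n:3d}: estimated probability = {frac:.3f}")
```

For a consistent estimator the printed proportions increase with $n$ and approach unity, which is the behaviour the definition demands for every fixed tolerance.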

Theorem 1. The mean of a random sample is a consistent estimator of the mean of a normal population.

Proof. Assume that $x_1, x_2, \dots, x_n$ is a random sample drawn from the population $N(\theta, \sigma^2)$. The statistic

$$\frac{\bar{x} - \theta}{\sigma/\sqrt{n}}$$

is a standardized normal variate. Define the estimator $t$ for the parent mean $\theta$ as follows :

$$t = (1/n) \sum x_j .$$

Taking $\sigma = 1$ (the general case follows on replacing $\epsilon$ by $\epsilon/\sigma$), the distribution of $t$ is then

$$dF = \sqrt{\frac{n}{2\pi}} \exp\{-\tfrac{1}{2} n (t-\theta)^2\}\, dt .$$

This shows that $t$ is distributed normally about $\theta$ with variance $(1/n)$, and hence $(t-\theta)\sqrt{n}$ is distributed normally about zero with unit variance. Then given $\epsilon > 0$,

$$P\{|(t-\theta)\sqrt{n}| < \epsilon\sqrt{n}\}$$

is the value of the normal integral between limits $\pm\epsilon\sqrt{n}$. That is,

$$P\{|(t-\theta)\sqrt{n}| < \epsilon\sqrt{n}\} = \frac{1}{\sqrt{2\pi}} \int_{-\epsilon\sqrt{n}}^{\epsilon\sqrt{n}} e^{-x^2/2}\, dx .$$

Let an arbitrary number $\eta > 0$ be given. Then it is always possible to find $n$ sufficiently large for $P\{|(t-\theta)\sqrt{n}| < \epsilon\sqrt{n}\}$ to be greater than $1 - \eta$, and it will continue to be so for any larger $n$. Then a positive integer $n_0$ can be determined such that

$$n > n_0 \Rightarrow P\{|t - \theta| < \epsilon\} > 1 - \eta$$

holds. The proof is complete.

Theorem 2. A statistic $t_n$ satisfying the conditions

$$n \to \infty \Rightarrow E(t_n) = \theta_n \to \theta \quad \text{and} \quad V(t_n) \to 0$$

is consistent.

Proof. To prove this fact, we first wish to establish a lemma due to Tchebycheff : Assume that $x$ is a stochastic variable with the property that $E(x) = \mu$ and $V(x) = \sigma^2$. Then

$$P\{|x - \mu| > \delta\} \le \sigma^2/\delta^2,$$

where $\delta$ is any assigned positive quantity.

To prove the lemma, assume that $f(x)$ denotes the probability density function of $x$. Then

$$\sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\, dx = \int_{-\infty}^{\mu-\delta} (x-\mu)^2 f(x)\, dx + \int_{\mu-\delta}^{\mu+\delta} (x-\mu)^2 f(x)\, dx + \int_{\mu+\delta}^{\infty} (x-\mu)^2 f(x)\, dx$$

$$\ge \delta^2 \Big[\int_{-\infty}^{\mu-\delta} f(x)\, dx + \int_{\mu+\delta}^{\infty} f(x)\, dx\Big] = \delta^2 P\{|x - \mu| > \delta\} .$$

Hence $P\{|x - \mu| > \delta\} \le \sigma^2/\delta^2$, completing the proof of the lemma.

To prove the theorem, set $|t_n - \theta_n| < \delta$. Then

$$|t_n - \theta| = |t_n - \theta_n + \theta_n - \theta| \le |t_n - \theta_n| + |\theta_n - \theta| < \delta + |\theta - \theta_n| .$$

Then

$$P\{|t_n - \theta| < \delta + |\theta - \theta_n|\} \ge P\{|t_n - \theta_n| < \delta\} \ge 1 - \frac{V(t_n)}{\delta^2}$$

by the lemma. By hypothesis, $n \to \infty \Rightarrow \theta_n \to \theta$ and $V(t_n) \to 0$. Then there exists a positive integer $n_0$ such that for each $n > n_0$,

$$|\theta - \theta_n| < \delta_1 \quad \text{and} \quad V(t_n) < \delta^2 \eta .$$

Evidently $|t_n - \theta| < \delta + |\theta - \theta_n|$ provides :

$$|t_n - \theta| < \delta + \delta_1 .$$

Consequently,

$$n > n_0 \Rightarrow P\{|t_n - \theta| < \delta + \delta_1\} \ge P\{|t_n - \theta| < \delta + |\theta - \theta_n|\} > 1 - \eta$$

[because $V(t_n)/\delta^2 < \delta^2\eta/\delta^2 = \eta$], where $\delta$ and $\delta_1$ are arbitrary small positive quantities. This establishes that the statistic $t_n$ is consistent. (The stochastic convergence is uniform provided that the mathematical convergence of $E(t_n)$ and $V(t_n)$ is uniform.)

Deduction. The sample variance in random sampling from a normal population is a consistent estimator of the population variance.

Proof. Recalling the sampling distribution of $s^2$, we have

$$dF = \frac{1}{2^{(n-1)/2}\, \Gamma\!\left(\frac{n-1}{2}\right)} \exp\!\left(-\frac{n s^2}{2\sigma^2}\right) \left(\frac{n s^2}{\sigma^2}\right)^{(n-3)/2} d\!\left(\frac{n s^2}{\sigma^2}\right), \quad 0 < s^2 < \infty .$$

Then

$$E(s^2) = \int_0^{\infty} s^2\, dF,$$

which simplifies to $\dfrac{n-1}{n}\sigma^2$. Then $n \to \infty \Rightarrow E(s^2) \to \sigma^2$.

As a matter of fact,

$$V(s^2) = E(s^4) - [E(s^2)]^2 = 2(n-1)\sigma^4/n^2 .$$

Now $n \to \infty \Rightarrow V(s^2) \to 0$. Theorem 2 states that $s^2$ is a consistent estimator of $\sigma^2$. This completes the proof.

D₂. The sample median is a consistent estimator for the population mean of a normal distribution.

In fact, for large $n$ the distribution of the median tends to the normal form :

$$dF(x) \propto \exp\{-2 n f_1^2 (x - \mu)^2\}\, dx,$$

where $f_1$ is the population median ordinate, which is equal to $(2\pi\sigma^2)^{-1/2}$. The variance of the sample median is therefore equal to $\pi\sigma^2/(2n)$ and tends to zero for large $n$.

Otherwise it may be dealt with in the following form : Assume that $x_1, x_2, \dots, x_n$ is a random sample drawn from a normal population $N(\mu, \sigma^2)$. Let $M_e$ be the median of the sample. Then, for large $n$,

$$z = \frac{M_e - \mu}{\sigma\sqrt{\pi}/\sqrt{2n}} = \frac{(M_e - \mu)\sqrt{2n}}{\sigma\sqrt{\pi}}$$

is the standardized normal variate of the median. Then for given $\epsilon > 0$,

$$P\{|M_e - \mu| < \epsilon\} = P\Big\{\frac{|M_e - \mu|\sqrt{2n}}{\sigma\sqrt{\pi}} < \frac{\epsilon\sqrt{2n}}{\sigma\sqrt{\pi}}\Big\} = P\Big\{-\frac{\epsilon\sqrt{2n}}{\sigma\sqrt{\pi}} < z < \frac{\epsilon\sqrt{2n}}{\sigma\sqrt{\pi}}\Big\}$$

$$= \frac{1}{\sqrt{2\pi}} \int_{-\epsilon\sqrt{2n}/(\sigma\sqrt{\pi})}^{\epsilon\sqrt{2n}/(\sigma\sqrt{\pi})} \exp(-\tfrac{1}{2} z^2)\, dz .$$

It is possible to select a positive integer $n_0$ such that for each $n > n_0$ the area under the normal curve between the limits $\pm\{\epsilon\sqrt{2n}\}/(\sigma\sqrt{\pi})$ becomes greater than $1 - \eta$. This proves our assertion.

Illustrative Examples.

Ex. 1. For the Cauchy distribution

$$dF(x) = \frac{1}{\pi} \frac{dx}{1 + (x-\theta)^2}, \quad -\infty < x < \infty,$$

the sample mean is not a consistent estimator of $\theta$, but the sample median is.

For, assume that $t$ is the mean-statistic to estimate $\theta$. Then the distribution of $t$ is given by

$$dF(t) = \frac{1}{\pi} \frac{dt}{1 + (t-\theta)^2}, \quad -\infty < t < \infty .$$

In this case the distribution of $t$ is the same as that of any single observation of the sample. Then if $z = t - \theta$,

$$P\{|t - \theta| < \epsilon\} = P\{|z| < \epsilon\} = \frac{1}{\pi} \int_{-\epsilon}^{\epsilon} \frac{dz}{1 + z^2} = (2/\pi) \tan^{-1} \epsilon,$$

which does not contain $n$ at all. Hence it is impossible to find a positive integer $n_0$ such that for each $n > n_0$, $P\{|z| < \epsilon\}$ is greater than $1 - \eta$. This proves that the sample mean is not a consistent estimator of $\theta$.

Suppose that $t$ is the sample median of the Cauchy distribution. The median ordinate is $(1/\pi)$ and the sampling variance of the median is given by

$$V(t) = \frac{\pi^2}{4n} .$$

Reason. For large $n$, the distribution of the median is asymptotically normal and is given by

$$dF(t) \propto \exp\{-2 n f_1^2 (t - \theta)^2\}\, dt,$$

where $f_1$ is the median ordinate of the parent population. Now

$$dF \propto \exp\Big\{-\frac{(t-\theta)^2}{2 \cdot \frac{1}{4 n f_1^2}}\Big\}\, dt .$$

Note that $f_1 = \dfrac{1}{\pi}$ and

$$V(M_e) = \frac{1}{4 n f_1^2} = \frac{1}{4n (1/\pi)^2} = \frac{\pi^2}{4n} .$$

Letting $n \to \infty \Rightarrow V(M_e) \to 0$.

Then Theorem 2 states that the sample median $t$ is a consistent estimate for the location parameter $\theta$ (the Cauchy distribution has no mean).
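The contrast in this example can be seen in a small simulation. The following sketch is an illustration under assumed settings (location parameter zero, tolerance 0.5, a fixed seed, and Cauchy variates generated by the inverse-distribution method); it estimates how often the sample mean and the sample median fall within the tolerance of the true value.

```python
import math
import random
import statistics

def cauchy_sample(rng, n, theta=0.0):
    """Draw n Cauchy variates with location theta by the inverse-CDF method."""
    return [theta + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

def fraction_close(estimator, n, eps=0.5, reps=300, seed=1):
    """Proportion of simulated samples whose estimate lies within eps of 0."""
    rng = random.Random(seed)
    hits = sum(abs(estimator(cauchy_sample(rng, n))) < eps for _ in range(reps))
    return hits / reps

sizes = (11, 401)
mean_frac = {n: fraction_close(statistics.fmean, n) for n in sizes}
median_frac = {n: fraction_close(statistics.median, n) for n in sizes}

for n in sizes:
    print(f"n = {n:3d}: mean {mean_frac[n]:.2f}, median {median_frac[n]:.2f}")

# Theoretical probability for the mean, which does not depend on n.
print(f"(2/pi) arctan(0.5) = {2 / math.pi * math.atan(0.5):.3f}")
```

The proportion for the mean stays near $(2/\pi)\tan^{-1}\epsilon$ whatever the sample size, while that for the median approaches unity.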

Ex. 2. A statistic $t_n$, whose expected value differs from the true value $\theta$ by terms of order $n^{-1}$, whose variance $V_n$ is of order $n^{-1}$ and which tends to normality as $n$ increases, is consistent.

Evidently $(t_n - \theta)$ tends to zero in probability and then $t_n$ is consistent. The result still holds even if the limiting distribution of $t_n$ is unspecified. As a matter of fact, let

$$E(t_n) = \theta + k_n, \quad V(t_n) = V_n,$$

where

$$\lim_{n \to \infty} k_n = 0, \quad \lim_{n \to \infty} V_n = 0 .$$

Then by the lemma of Theorem 2, $P\{|t_n - (\theta + k_n)| < \epsilon\} \ge 1 - (V_n/\epsilon^2) \to 1$ as $n \to \infty$, and hence $P\{|t_n - \theta| < \epsilon\} > 1 - \eta$ for each $n > n_0$, where $n_0$ is a positive integer. In conclusion, $t_n$ is consistent.
15.3. Unbiasedness.

Definition. The statistic $t = t(x_1, x_2, \dots, x_n)$ is said to be an unbiased estimate (or estimator) of the parameter $\theta$ if $E(t) = \theta$.

The estimator is positively biased if $E(t) > \theta$, is negatively biased if $E(t) < \theta$, and is asymptotically unbiased if

$$\lim_{n \to \infty} E(t_n) = \theta .$$


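Illustrative Examples.

Ex. 1. The mean statistic is an unbiased estimator of the parent mean provided that the latter exists, but the sample variance is not an unbiased estimator of the parent variance.

We know that

$$E(\bar{x}) = E\Big[\frac{1}{n} \sum_{j=1}^{n} x_j\Big] = \frac{1}{n}\big[E(x_1) + E(x_2) + \dots + E(x_n)\big] = \frac{1}{n}\big[\mu + \mu + \dots \ (n \text{ times})\big] = \mu .$$

This implies that $\bar{x}$ is an unbiased estimator of $\mu$.

By definition,

$$E(s^2) = E\Big[\frac{1}{n} \sum_{j=1}^{n} (x_j - \bar{x})^2\Big] = E\Big[\frac{1}{n} \sum_{j=1}^{n} (x_j - \mu)^2 - (\bar{x} - \mu)^2\Big]$$

$$= \frac{1}{n} \sum_{j=1}^{n} E(x_j - \mu)^2 - E(\bar{x} - \mu)^2 = \frac{1}{n} \sum_{j=1}^{n} \sigma^2 - \sigma_{\bar{x}}^2 = \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\sigma^2 .$$

This shows that the sample variance $s^2$ is not an unbiased estimate of the parent variance $\sigma^2$.

Note that $\dfrac{n}{n-1} s^2$ is an unbiased estimator of $\sigma^2$, because

$$E\Big[\frac{n s^2}{n-1}\Big] = \frac{n}{n-1} E(s^2) = \sigma^2, \quad \text{i.e.} \quad E\Big[\frac{\sum (x_j - \bar{x})^2}{n-1}\Big] = \sigma^2 .$$

This is why some authors define the sample variance to be

$$\sum_{j=1}^{n} (x_j - \bar{x})^2 / (n-1) .$$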
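The bias factor $(n-1)/n$ can be confirmed exactly on a small discrete population. The sketch below assumes a hypothetical population of four equally likely values sampled with replacement, and enumerates every possible sample of size 3 in exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: four equally likely values, sampled with replacement.
population = [1, 2, 3, 4]
n = 3

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def sample_variance(sample):
    """Sample variance with divisor n, as defined in the text."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample) / len(sample)

# All equally likely ordered samples of size n.
samples = list(product(population, repeat=n))
expected_s2 = sum(sample_variance(s) for s in samples) / len(samples)

print("population variance :", sigma2)
print("E(s^2)              :", expected_s2)
print("(n-1)/n * variance  :", Fraction(n - 1, n) * sigma2)
```

Multiplying the divisor-$n$ variance by $n/(n-1)$ removes the bias, which is the reason for the alternative definition quoted above.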

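Ex. 2. $x_j$, $j = 1, 2, \dots, n$, are random observations on a Bernoulli variable $x$ taking the value 1 with probability $\theta$ and the value 0 with probability $1 - \theta$. Then $\dfrac{t(t-1)}{n(n-1)}$ is an unbiased estimate of $\theta^2$, where

$$t = x_1 + x_2 + \dots + x_n .$$

Our hypothesis says that $x_j$ assumes only the two values 1 and 0 with respective probabilities $\theta$ and $1 - \theta$. Then

$$E(x_j) = 1 \cdot \theta + 0 \cdot (1-\theta) = \theta, \quad j = 1, 2, \dots, n,$$

$$E(x_j^2) = 1^2 \cdot \theta + 0^2 \cdot (1-\theta) = \theta, \quad j = 1, 2, \dots, n .$$

Hence $V(x_j) = E(x_j^2) - \{E(x_j)\}^2 = \theta - \theta^2 = \theta(1-\theta)$.

Now

$$E(t) = E\Big(\sum_{j=1}^{n} x_j\Big) = \sum_{j=1}^{n} E(x_j) = \sum_{j=1}^{n} \theta = n\theta$$

and

$$V(t) = V(x_1 + x_2 + \dots + x_n) = V(x_1) + V(x_2) + \dots + V(x_n),$$

noting that the covariance terms vanish because of the independence of $x_j$ $(j = 1, 2, \dots, n)$.

Now

$$V(t) = \sum_{j=1}^{n} V(x_j) = \sum_{j=1}^{n} \theta(1-\theta) = n\theta(1-\theta) .$$

Hence

$$E(t^2) = V(t) + \{E(t)\}^2 = n\theta(1-\theta) + n^2\theta^2 .$$

In fact,

$$E\Big[\frac{t(t-1)}{n(n-1)}\Big] = \frac{1}{n(n-1)}\{n\theta(1-\theta) + n^2\theta^2 - n\theta\} = \frac{1}{n(n-1)}\{n\theta - n\theta^2 + n^2\theta^2 - n\theta\} = \frac{n(n-1)\theta^2}{n(n-1)} = \theta^2 .$$

This is what we wished to show.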
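The result may be verified exactly for particular values. The sketch below assumes the illustrative values $n = 6$ and $\theta = 3/10$, and sums the estimator over the binomial distribution of the total number of successes using exact fractions.

```python
from fractions import Fraction
from math import comb

def expected_estimator(n, theta):
    """E[t(t-1) / (n(n-1))], where t is the number of successes in n trials."""
    total = Fraction(0)
    for k in range(n + 1):
        prob = comb(n, k) * theta ** k * (1 - theta) ** (n - k)
        total += prob * Fraction(k * (k - 1), n * (n - 1))
    return total

theta = Fraction(3, 10)   # illustrative parameter value
value = expected_estimator(6, theta)

print("E[t(t-1)/(n(n-1))] =", value)
print("theta squared      =", theta ** 2)
```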

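Ex. 3. If $t_1$ and $t_2$ are two unbiased estimates of a parameter $\theta$ with variances $\sigma_1^2$ and $\sigma_2^2$ and correlation $\rho$, what is the best unbiased linear combination of $t_1$ and $t_2$, and what is the variance of such a compound ?

Define $t$ as the linear combination of $t_1$ and $t_2$ such that

$$t = l_1 t_1 + l_2 t_2,$$

where $E(t_1) = E(t_2) = \theta$.

Evidently $E(t) = E(l_1 t_1 + l_2 t_2) = l_1 E(t_1) + l_2 E(t_2) = \theta$, subject to the condition that $l_1 + l_2 = 1$.

Now

$$V(t) = l_1^2 V(t_1) + l_2^2 V(t_2) + 2 l_1 l_2 \operatorname{Cov}(t_1, t_2) = l_1^2 \sigma_1^2 + l_2^2 \sigma_2^2 + 2 l_1 l_2 \rho \sigma_1 \sigma_2 . \qquad ...(1)$$

We wish to minimize (1) with respect to $l_1$ and $l_2$ subject to $l_1 + l_2 = 1$. At the minimum the two partial derivatives are equal to a common multiplier $\lambda$ :

$$\frac{\partial V(t)}{\partial l_1} = 2 l_1 \sigma_1^2 + 2 l_2 \rho \sigma_1 \sigma_2 = \lambda, \qquad ...(2)$$

$$\frac{\partial V(t)}{\partial l_2} = 2 l_2 \sigma_2^2 + 2 l_1 \rho \sigma_1 \sigma_2 = \lambda . \qquad ...(3)$$

Now $l_1 \sigma_1^2 + l_2 \rho \sigma_1 \sigma_2 = l_2 \sigma_2^2 + l_1 \rho \sigma_1 \sigma_2$, which yields

$$l_1 (\sigma_1^2 - \rho \sigma_1 \sigma_2) = l_2 (\sigma_2^2 - \rho \sigma_1 \sigma_2), \qquad ...(4)$$

from which we find

$$\frac{l_1}{\sigma_2^2 - \rho \sigma_1 \sigma_2} = \frac{l_2}{\sigma_1^2 - \rho \sigma_1 \sigma_2} = \frac{l_1 + l_2}{\sigma_1^2 + \sigma_2^2 - 2\rho \sigma_1 \sigma_2} = \frac{1}{\sigma_1^2 + \sigma_2^2 - 2\rho \sigma_1 \sigma_2} .$$

Then

$$l_1 = \frac{\sigma_2^2 - \rho \sigma_1 \sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho \sigma_1 \sigma_2}, \quad l_2 = \frac{\sigma_1^2 - \rho \sigma_1 \sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho \sigma_1 \sigma_2} .$$

Therefore

$$t = \frac{\sigma_2^2 - \rho \sigma_1 \sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho \sigma_1 \sigma_2}\, t_1 + \frac{\sigma_1^2 - \rho \sigma_1 \sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho \sigma_1 \sigma_2}\, t_2,$$

which is the required best unbiased linear combination of $t_1$ and $t_2$.

Clearly

$$V(t) = \frac{\sigma_1^2 (\sigma_2^2 - \rho \sigma_1 \sigma_2)^2 + \sigma_2^2 (\sigma_1^2 - \rho \sigma_1 \sigma_2)^2 + 2\rho \sigma_1 \sigma_2 (\sigma_1^2 - \rho \sigma_1 \sigma_2)(\sigma_2^2 - \rho \sigma_1 \sigma_2)}{(\sigma_1^2 + \sigma_2^2 - 2\rho \sigma_1 \sigma_2)^2}$$

$$= \frac{\sigma_1^2 \sigma_2^2}{(\sigma_1^2 + \sigma_2^2 - 2\rho \sigma_1 \sigma_2)^2} \{(\sigma_2 - \rho \sigma_1)^2 + (\sigma_1 - \rho \sigma_2)^2 + 2\rho (\sigma_1 - \rho \sigma_2)(\sigma_2 - \rho \sigma_1)\} = \frac{\sigma_1^2 \sigma_2^2 (1 - \rho^2)}{\sigma_1^2 + \sigma_2^2 - 2\rho \sigma_1 \sigma_2} .$$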
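The weights of this example can be checked numerically. The sketch below assumes the illustrative values $\sigma_1 = 1$, $\sigma_2 = 2$, $\rho = 0.3$; it compares the variance at the weight $l_1 = (\sigma_2^2 - \rho\sigma_1\sigma_2)/(\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2)$ with a direct search over a grid of weights and with the closed form $\sigma_1^2\sigma_2^2(1-\rho^2)/(\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2)$ to which the variance reduces.

```python
def combo_variance(l1, s1, s2, rho):
    """Variance of l1*t1 + (1 - l1)*t2 for standard deviations s1, s2."""
    l2 = 1.0 - l1
    return l1 ** 2 * s1 ** 2 + l2 ** 2 * s2 ** 2 + 2 * l1 * l2 * rho * s1 * s2

def best_weight(s1, s2, rho):
    """Weight on t1 that minimizes the variance subject to l1 + l2 = 1."""
    return (s2 ** 2 - rho * s1 * s2) / (s1 ** 2 + s2 ** 2 - 2 * rho * s1 * s2)

s1, s2, rho = 1.0, 2.0, 0.3   # illustrative values only

l1 = best_weight(s1, s2, rho)
v_min = combo_variance(l1, s1, s2, rho)

# Brute-force search over weights from -1 to 2 in steps of 0.001.
grid_min = min(combo_variance(k / 1000, s1, s2, rho) for k in range(-1000, 2001))

closed_form = s1 ** 2 * s2 ** 2 * (1 - rho ** 2) / (s1 ** 2 + s2 ** 2 - 2 * rho * s1 * s2)

print(f"l1 = {l1:.4f}, minimum variance = {v_min:.6f}")
print(f"grid minimum = {grid_min:.6f}, closed form = {closed_form:.6f}")
```

Note that the minimum variance is smaller than either $\sigma_1^2$ or $\sigma_2^2$ unless $\rho$ equals the ratio of the smaller to the larger standard deviation.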

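Deduction. Assume that $t_1$ in the preceding example is an unbiased minimum variance estimate and $t_2$ any other unbiased estimate with variance $\sigma^2/e$, where $V(t_1) = \sigma^2$. Then the correlation between $t_1$ and $t_2$ is $\sqrt{e}$.

(4) provides :

$$l_1 (\sigma_1^2 - \rho \sigma_1 \sigma_2) = l_2 (\sigma_2^2 - \rho \sigma_1 \sigma_2) .$$

Replacing $\sigma_1$ by $\sigma$ and $\sigma_2$ by $\sigma/\sqrt{e}$ leads to

$$l_1 \Big(\sigma^2 - \frac{\rho \sigma^2}{\sqrt{e}}\Big) = l_2 \Big(\frac{\sigma^2}{e} - \frac{\rho \sigma^2}{\sqrt{e}}\Big),$$

which simplifies to

$$l_1 (e - \rho\sqrt{e}) = l_2 (1 - \rho\sqrt{e}),$$

giving

$$\frac{l_1}{1 - \rho\sqrt{e}} = \frac{l_2}{e - \rho\sqrt{e}} = \frac{l_1 + l_2}{1 + e - 2\rho\sqrt{e}} .$$

Thus we obtain

$$l_1 = \frac{1 - \rho\sqrt{e}}{1 + e - 2\rho\sqrt{e}}, \quad l_2 = \frac{e - \rho\sqrt{e}}{1 + e - 2\rho\sqrt{e}} .$$

Hence

$$t = \frac{1}{1 + e - 2\rho\sqrt{e}} \big[(1 - \rho\sqrt{e})\, t_1 + (e - \rho\sqrt{e})\, t_2\big] .$$

Evidently

$$V(t) = \frac{1}{(1 + e - 2\rho\sqrt{e})^2} \Big[(1 - \rho\sqrt{e})^2 \sigma^2 + (e - \rho\sqrt{e})^2 \frac{\sigma^2}{e} + 2 (1 - \rho\sqrt{e})(e - \rho\sqrt{e}) \frac{\rho \sigma^2}{\sqrt{e}}\Big] .$$

$$\text{Numerator} = \sigma^2 \big[1 + \rho^2 e - 2\rho\sqrt{e} + e + \rho^2 - 2\rho\sqrt{e} + 2\rho (1 - \rho\sqrt{e})(\sqrt{e} - \rho)\big]$$

$$= \sigma^2 \big[(1 + e - 2\rho\sqrt{e}) - \rho^2 (1 + e - 2\rho\sqrt{e})\big] = \sigma^2 (1 - \rho^2)(1 + e - 2\rho\sqrt{e}) .$$

Then $V(t) = \dfrac{\sigma^2 (1 - \rho^2)}{1 + e - 2\rho\sqrt{e}}$, so that

$$\frac{V(t)}{\sigma^2} = \frac{1 - \rho^2}{(1 - \rho^2) + (\sqrt{e} - \rho)^2} \le 1 .$$

Since $t_1$ has minimum variance, $V(t)$ cannot be less than $\sigma^2$, so the equality must hold. The equality holds iff $\rho = \sqrt{e}$. This implies that the correlation between $t_1$ and $t_2$ is $\sqrt{e}$.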

Ex. 4. Assume that $t_1$ and $t_2$ are two unbiased statistics having the same variance. Then their correlation is $\ge 2e - 1$, where $e$ is the ratio of the variance of the best estimate to the common variance of $t_1$ and $t_2$.

For, define the statistic $t$ such that $t = \tfrac{1}{2}(t_1 + t_2)$; it is also unbiased. Then

$$V(t) = \tfrac{1}{4}\big[V(t_1) + V(t_2) + 2 \operatorname{Cov}(t_1, t_2)\big] = \tfrac{1}{4}\big[V(t_1) + V(t_2) + 2\rho \sqrt{V(t_1) V(t_2)}\big] = \tfrac{1}{4}\big[2V(1 + \rho)\big] = \frac{V(1 + \rho)}{2},$$

where $V(t_1) = V(t_2) = V$, say.

Our hypothesis says that

$$e = \frac{\text{Minimum variance}}{V},$$

which, in view of $V(t) = \dfrac{V(1 + \rho)}{2}$, yields :

$$V(t) = \frac{1 + \rho}{2e} \cdot \text{Minimum variance} .$$

The fact that $V(t) \ge$ minimum variance leads to $\dfrac{1 + \rho}{2e} \ge 1$, which furnishes $\rho \ge 2e - 1$, and we are done.

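Ex. 5. Assume that $X$ is a random variable normally distributed with mean zero and unknown standard deviation $\sigma$. Consider as an estimate of $\sigma$ the mean deviation

$$d = \frac{1}{n} \sum_{i=1}^{n} |X_i| .$$

Is it unbiased ?

Evidently

$$E(|X|) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} |x| \exp\Big(-\frac{x^2}{2\sigma^2}\Big)\, dx = \frac{2}{\sigma\sqrt{2\pi}} \int_{0}^{\infty} x \exp\Big(-\frac{x^2}{2\sigma^2}\Big)\, dx = \sigma\sqrt{2/\pi},$$

by setting $x^2/2\sigma^2 = t$.

Then

$$E(d) = \frac{1}{n} E\Big(\sum_{i=1}^{n} |X_i|\Big) = \frac{1}{n}\big[n E(|X|)\big] = \sigma\sqrt{2/\pi} .$$

Then $E\{\sqrt{\pi/2}\, d\} = \sigma$ and hence $\sqrt{\pi/2}\, d$ is the unbiased estimate of $\sigma$.

Now

$$V(d) = \frac{1}{n}\big[E(X^2) - \{E(|X|)\}^2\big] = \frac{\sigma^2}{n}\Big(1 - \frac{2}{\pi}\Big),$$

so that

$$V\{\sqrt{\pi/2}\, d\} = \frac{\pi}{2} \cdot \frac{\sigma^2}{n}\Big(1 - \frac{2}{\pi}\Big) = \frac{\sigma^2}{2n}(\pi - 2) .$$

This shows that the variance of the estimate of $\sigma$ from the mean deviation is $\dfrac{\sigma^2}{2n}(\pi - 2)$.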
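A Monte Carlo check of this example is sketched below. It assumes the illustrative values $\sigma = 2$ and $n = 10$ with a fixed random seed, and compares the simulated mean and variance of $\sqrt{\pi/2}\, d$ with $\sigma$ and $\sigma^2(\pi - 2)/(2n)$; agreement is only approximate, as with any simulation.

```python
import math
import random

rng = random.Random(42)
sigma = 2.0            # illustrative standard deviation
n, reps = 10, 20000    # sample size and number of simulated samples

estimates = []
for _ in range(reps):
    # Mean deviation about the known mean zero.
    d = sum(abs(rng.gauss(0.0, sigma)) for _ in range(n)) / n
    # Rescale so that the estimate is unbiased for sigma.
    estimates.append(math.sqrt(math.pi / 2) * d)

avg = sum(estimates) / reps
var = sum((e - avg) ** 2 for e in estimates) / reps
theory_var = sigma ** 2 * (math.pi - 2) / (2 * n)

print(f"simulated mean     = {avg:.3f}  (sigma = {sigma})")
print(f"simulated variance = {var:.4f}  (theory = {theory_var:.4f})")
```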

Ex. 6. Let $O_n : x_1, x_2, \dots, x_n$ be a random sample from a normal population $N(\mu, 1)$. Then $t = \dfrac{1}{n} \sum_{j=1}^{n} x_j^2$ is an unbiased estimator of $\mu^2 + 1$.

By hypothesis,

$$E(x_j) = \mu, \quad V(x_j) = 1, \quad j = 1, 2, \dots, n .$$

In fact, $V(x_j) = E(x_j^2) - \{E(x_j)\}^2$, which gives

$$E(x_j^2) = V(x_j) + \{E(x_j)\}^2 = 1 + \mu^2 .$$

Now

$$E(t) = E\Big(\frac{1}{n} \sum_{j=1}^{n} x_j^2\Big) = \frac{1}{n} \sum_{j=1}^{n} E(x_j^2) = \frac{1}{n} \sum_{j=1}^{n} (1 + \mu^2) = \frac{1}{n} \cdot n (1 + \mu^2) = 1 + \mu^2 .$$

This proves that $t$ is an unbiased estimator of $\mu^2 + 1$.

15.4. Likelihood Function

Definition. Assume that the frequency function of the continuous or discrete population is $f(x \mid \theta)$. Then the likelihood function of a sample of $n$ independent observations is defined by

$$L(x_1, x_2, \dots, x_n \mid \theta) = f(x_1 \mid \theta)\, f(x_2 \mid \theta) \dots f(x_n \mid \theta) . \qquad ...(1)$$

This terminology is due to R. A. Fisher.

In symbols :

$$L = \prod_{j=1}^{n} f(x_j \mid \theta) . \qquad ...(2)$$


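Theorem 3. Assume that $L = \prod_{j=1}^{n} f(x_j \mid \theta)$ is the likelihood function of a sample of $n$ independent observations. Then

$$E\Big[\Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big] = -E\Big[\frac{\partial^2 \log L}{\partial \theta^2}\Big] . \qquad ...(3)$$

Proof. As a matter of fact, $L$ is the joint frequency function of $n$ independent observations. Then evidently

$$\int \dots \int L\, dx_1 \dots dx_n = 1 . \qquad ...(4)$$

Assume that the first two derivatives of $L$ with regard to $\theta$ exist for each $\theta$. Then differentiation of both sides of (4) leads to

$$\int \dots \int \frac{\partial L}{\partial \theta}\, dx_1 \dots dx_n = 0, \qquad ...(5)$$

provided that the limits of integration are independent of $\theta$. Note that an interchange of the operations of differentiation and integration is justified in view of uniform convergence of the integral.

On replacing $\dfrac{\partial L}{\partial \theta}$ by $\Big(\dfrac{1}{L} \dfrac{\partial L}{\partial \theta}\Big) L$, or by $\Big(\dfrac{\partial \log L}{\partial \theta}\Big) L$, (5) assumes the form

$$E\Big(\frac{\partial \log L}{\partial \theta}\Big) = \int \dots \int \Big(\frac{\partial \log L}{\partial \theta}\Big) L\, dx_1 \dots dx_n = 0 . \qquad ...(6)$$

Differentiation of (6) with respect to $\theta$ yields :

$$\int \dots \int \Big[\frac{\partial^2 \log L}{\partial \theta^2} L + \frac{\partial \log L}{\partial \theta} \frac{\partial L}{\partial \theta}\Big]\, dx_1 \dots dx_n = 0,$$

which can be put into the form

$$\int \dots \int \Big[\frac{\partial^2 \log L}{\partial \theta^2} + \Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big] L\, dx_1 \dots dx_n = 0,$$

which furnishes :

$$E\Big[\Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big] = -E\Big[\frac{\partial^2 \log L}{\partial \theta^2}\Big] .$$

This proves (3).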
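The identity can be verified exactly in a simple discrete case, where sums take the place of the integrals. The sketch below assumes a single Bernoulli observation with the illustrative value $\theta = 1/4$; the function names are hypothetical helpers introduced for the check.

```python
from fractions import Fraction

def score(x, theta):
    """First derivative of log f(x | theta) for a Bernoulli observation."""
    return Fraction(x) / theta - Fraction(1 - x) / (1 - theta)

def second_derivative(x, theta):
    """Second derivative of log f(x | theta) for a Bernoulli observation."""
    return -Fraction(x) / theta ** 2 - Fraction(1 - x) / (1 - theta) ** 2

theta = Fraction(1, 4)                 # illustrative parameter value
probs = {1: theta, 0: 1 - theta}       # f(x | theta) for x = 1, 0

mean_score = sum(p * score(x, theta) for x, p in probs.items())
mean_sq_score = sum(p * score(x, theta) ** 2 for x, p in probs.items())
mean_second = sum(p * second_derivative(x, theta) for x, p in probs.items())

print("E[d log f / d theta]       =", mean_score)
print("E[(d log f / d theta)^2]   =", mean_sq_score)
print("-E[d^2 log f / d theta^2]  =", -mean_second)
```

The first printed value corresponds to (6) and the equality of the last two to (3); for $n$ independent observations both sides are simply multiplied by $n$.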

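Theorem 4. (Cramér-Rao inequality). Assume that $L$ is the likelihood function of a sample of $n$ independent observations of a continuous population. Let $t$ be an unbiased estimator of some function of $\theta$, say $\psi(\theta)$. Then

$$V(t) \ge \frac{\{\psi'(\theta)\}^2}{E\Big[\Big(\dfrac{\partial \log L}{\partial \theta}\Big)^2\Big]} . \qquad ...(7)$$

(7) is the fundamental inequality for the variance of an estimator, generally known as the Cramér-Rao inequality.

Proof. Our hypothesis reads that $E(t) = \psi(\theta)$, that is,

$$E(t) = \int \dots \int t\, L\, dx_1 \dots dx_n = \psi(\theta) . \qquad ...(8)$$

Differentiating (8) with respect to $\theta$, we find

$$\int \dots \int t\, \frac{\partial \log L}{\partial \theta}\, L\, dx_1 \dots dx_n = \psi'(\theta) . \qquad ...(9)$$

Now, by (6),

$$E\Big[\psi(\theta) \frac{\partial \log L}{\partial \theta}\Big] = \psi(\theta)\, E\Big(\frac{\partial \log L}{\partial \theta}\Big) = 0 .$$

That is,

$$\int \dots \int \psi(\theta)\, \frac{\partial \log L}{\partial \theta}\, L\, dx_1 \dots dx_n = 0 . \qquad ...(10)$$

Hence (9) in conjunction with (10) gives

$$\psi'(\theta) = \int \dots \int \{t - \psi(\theta)\}\, \frac{\partial \log L}{\partial \theta}\, L\, dx_1 \dots dx_n . \qquad ...(11)$$

Recalling the Cauchy-Schwarz inequality for integrals :

$$\Big(\int \dots \int f g\, dx_1 \dots dx_n\Big)^2 \le \Big(\int \dots \int f^2\, dx_1 \dots dx_n\Big) \Big(\int \dots \int g^2\, dx_1 \dots dx_n\Big),$$

from (11) we have

$$\{\psi'(\theta)\}^2 \le \int \dots \int \{t - \psi(\theta)\}^2 L\, dx_1 \dots dx_n \cdot \int \dots \int \Big(\frac{\partial \log L}{\partial \theta}\Big)^2 L\, dx_1 \dots dx_n,$$

i.e.

$$\{\psi'(\theta)\}^2 \le E\{t - \psi(\theta)\}^2 \cdot E\Big[\Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big],$$

which provides :

$$V(t) = E\{t - \psi(\theta)\}^2 \ge \{\psi'(\theta)\}^2 \Big/ E\Big[\Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big] .$$

This is what we wished to show.

In view of the preceding theorem, the inequality (7) assumes the form

$$V(t) \ge -\frac{\{\psi'(\theta)\}^2}{E\Big[\dfrac{\partial^2 \log L}{\partial \theta^2}\Big]} . \qquad ...(12)$$

Remark. (7) and (12) are the minimum variance bound (abbreviated to MVB) for the estimation of $\psi(\theta)$. Now we have the following :

Definition. An estimator which attains the minimum variance bound defined by (7) and (12) is termed a MVB estimator.

It is important to bear in mind that the condition under which the MVB was derived, namely the non-dependence of the range of $f(x \mid \theta)$ upon $\theta$, is unnecessarily restrictive. It is only necessary that (6) hold for the MVB (7) to follow. The MVB (7) assumes the form (12) in case (3) holds.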

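Deductions of Theorem 4.

D₁. Assume that $t$ is estimating $\theta$.

Then $E(t) = \theta \Rightarrow \psi(\theta) = \theta \Rightarrow \psi'(\theta) = 1$.

Hence for an unbiased estimator of $\theta$, (7) assumes the form

$$V(t) \ge 1 \Big/ E\Big[\Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big] .$$

The quantity $I$ is defined as

$$I = E\Big[\Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big] .$$

D₂. If $\dfrac{\partial \log L}{\partial \theta} = A(\theta)\{t - \psi(\theta)\}$, where $A$ is independent of the observations but is a function of $\theta$, then $t$ attains the MVB and

$$V(t) = \psi'(\theta)/A(\theta) . \qquad ...(15)$$

Proof. The inequality (7) was obtained from

$$\{\psi'(\theta)\}^2 \le E\{t - \psi(\theta)\}^2 \cdot E\Big[\Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big]$$

by use of the Cauchy-Schwarz inequality. The necessary and sufficient condition that the Cauchy-Schwarz inequality becomes an equality is that $t - \psi(\theta)$ is proportional to $\dfrac{\partial \log L}{\partial \theta}$ for all sets of observations. The condition may be written as

$$\frac{\partial \log L}{\partial \theta} = A\{t - \psi(\theta)\}, \qquad ...(16)$$

where $A$ is independent of the observations but may be a function of $\theta$. The formulation of (16) is

$$\frac{\partial \log L}{\partial \theta} = A(\theta)\{t - \psi(\theta)\}, \qquad ...(17)$$

which yields :

$$V\Big(\frac{\partial \log L}{\partial \theta}\Big) = E\big[A(\theta)\{t - \psi(\theta)\}\big]^2 = \{A(\theta)\}^2\, V(t) . \qquad ...(18)$$

As a matter of fact,

$$V\Big(\frac{\partial \log L}{\partial \theta}\Big) = E\Big[\Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big] - \Big[E\Big(\frac{\partial \log L}{\partial \theta}\Big)\Big]^2,$$

which, by dint of (6), becomes $E\Big[\Big(\dfrac{\partial \log L}{\partial \theta}\Big)^2\Big]$, so that (18) gives

$$E\Big[\Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big] = \{A(\theta)\}^2\, V(t) . \qquad ...(19)$$

The fact that the Cauchy-Schwarz inequality becomes an equality implies that the Cramér-Rao inequality also becomes an equality. Hence we have

$$V(t) = \{\psi'(\theta)\}^2 \Big/ E\Big[\Big(\frac{\partial \log L}{\partial \theta}\Big)^2\Big] . \qquad ...(20)$$

Eliminating $E\Big[\Big(\dfrac{\partial \log L}{\partial \theta}\Big)^2\Big]$ between (19) and (20) leads to

$$\{V(t)\, A(\theta)\}^2 = \{\psi'(\theta)\}^2,$$

which yields :

$$V(t) = \psi'(\theta)/A(\theta) .$$

This completes the proof of D₂.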

Conclusion. $t$ is a MVB estimator of $\psi(\theta)$ with variance given by (15) provided that (17) holds.

Remark. Let $\psi(\theta) = \theta$. Then (15) furnishes :

$V(t) = 1/A(\theta)$,

which is equal to 

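Illustrative Examples.

Ex. 1. The mean of a random sample $x_1, x_2, \dots, x_n$ drawn from a normal population

$$dF(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\Big\{-\frac{1}{2}\Big(\frac{x - \theta}{\sigma}\Big)^2\Big\}\, dx,$$

where $\sigma$ is known, is the MVB estimator of $\theta$ with variance $\sigma^2/n$.

For, the definition of the likelihood function provides :

$$L = \prod_{j=1}^{n} f(x_j \mid \theta) .$$

Then

$$\log L = \text{const.} - \frac{1}{2\sigma^2} \sum_{j=1}^{n} (x_j - \theta)^2,$$

giving :

$$\frac{\partial \log L}{\partial \theta} = \frac{1}{\sigma^2} \sum_{j=1}^{n} (x_j - \theta) = (\bar{x} - \theta)/(\sigma^2/n) .$$

This is of the form

$$\frac{\partial \log L}{\partial \theta} = A(\theta)\{t - \psi(\theta)\}$$

with $t = \bar{x}$, $\psi(\theta) = \theta$, $A(\theta) = \dfrac{n}{\sigma^2}$.

This proves our assertion.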

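Ex. 2. There exists no MVB estimator of $\theta$ in

$$dF(x) = \frac{1}{\pi} \frac{dx}{\{1 + (x - \theta)^2\}}, \quad -\infty < x < \infty .$$

For, evidently

$$L = \frac{1}{\pi^n} \prod_{j=1}^{n} \frac{1}{\{1 + (x_j - \theta)^2\}} .$$

Then

$$\log L = \text{const.} - \sum_{j=1}^{n} \log\{1 + (x_j - \theta)^2\},$$

which provides :

$$\frac{\partial \log L}{\partial \theta} = 2 \sum_{j=1}^{n} \frac{(x_j - \theta)}{\{1 + (x_j - \theta)^2\}} .$$

This cannot be thrown into the form

$$\frac{\partial \log L}{\partial \theta} = A(\theta)\{t - \psi(\theta)\},$$

and hence the conclusion follows there and then.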

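Ex. 3. The mean of a random sample from the Poisson distribution

$$f(x \mid \theta) = e^{-\theta} \theta^x / x!, \quad x = 0, 1, 2, \dots, \infty$$

is the MVB estimator of $\theta$ with variance $\theta/n$.

For, evidently

$$L = \frac{e^{-n\theta}\, \theta^{\sum_{j=1}^{n} x_j}}{x_1!\, x_2! \dots x_n!},$$

so that

$$\log L = \text{const.} + \Big(\sum_{j=1}^{n} x_j\Big) \log \theta - n\theta .$$

Then

$$\frac{\partial \log L}{\partial \theta} = \frac{n\bar{x}}{\theta} - n = \frac{n}{\theta}(\bar{x} - \theta) .$$

This shows that $\bar{x}$ is the MVB estimator of $\theta$ with variance $\theta/n$.

Ex. 4. In the binomial distribution

$$f(r \mid \theta) = \binom{n}{r} \theta^r (1 - \theta)^{n-r}, \quad r = 0, 1, 2, \dots, n,$$

$r/n$ is the MVB estimator of $\theta$ with variance $\theta(1-\theta)/n$.

For, evidently

$$\log L = \text{const.} + r \log \theta + (n - r) \log (1 - \theta) .$$

Then

$$\frac{\partial \log L}{\partial \theta} = \frac{r}{\theta} - \frac{n - r}{1 - \theta} = \frac{r - n\theta}{\theta(1 - \theta)} = \frac{n}{\theta(1 - \theta)} \Big(\frac{r}{n} - \theta\Big) .$$

This has the form

$$\frac{\partial \log L}{\partial \theta} = A(\theta)\{t - \psi(\theta)\} .$$

Hence the conclusion of the problem follows.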
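For the binomial case the attainment of the bound can be checked exactly. The sketch below assumes the illustrative values $n = 8$ and $\theta = 2/5$; it computes the mean and variance of $r/n$ from the binomial probabilities and compares the variance with $1/A(\theta) = \theta(1-\theta)/n$.

```python
from fractions import Fraction
from math import comb

def moments_of_proportion(n, theta):
    """Exact mean and variance of r/n when r is binomial (n, theta)."""
    mean = Fraction(0)
    second = Fraction(0)
    for r in range(n + 1):
        prob = comb(n, r) * theta ** r * (1 - theta) ** (n - r)
        mean += prob * Fraction(r, n)
        second += prob * Fraction(r, n) ** 2
    return mean, second - mean ** 2

n, theta = 8, Fraction(2, 5)   # illustrative values only
mean, var = moments_of_proportion(n, theta)

# Reciprocal of A(theta) = n / (theta (1 - theta)), i.e. the bound for psi = theta.
bound = theta * (1 - theta) / n

print("E(r/n) =", mean)
print("V(r/n) =", var)
print("bound  =", bound)
```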

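Ex. 5. In estimating $\theta$ in the normal distribution

$$dF(x) = \frac{1}{\theta\sqrt{2\pi}} \exp(-x^2/2\theta^2)\, dx, \quad -\infty < x < \infty,$$

$\dfrac{1}{n} \sum x^2$ is a MVB estimator of $\theta^2$ (the variance of the population) with the sampling variance $2\theta^4/n$, but there exists no MVB estimator of $\theta$ itself.

For, evidently

$$\log L = n \log (1/\theta) + \text{const.} - \frac{\sum x^2}{2\theta^2} .$$

Then

$$\frac{\partial \log L}{\partial \theta} = -\frac{n}{\theta} + \frac{\sum x^2}{\theta^3} = \frac{n}{\theta^3} \Big(\frac{1}{n} \sum x^2 - \theta^2\Big),$$

so that $A(\theta) = \dfrac{n}{\theta^3}$, $t = \dfrac{1}{n} \sum x^2$, $\psi(\theta) = \theta^2$.

This proves our assertion.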

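Ex. 6. In samples from the Type III Pearsonian population

\[ dF(x) = \frac{1}{\Gamma(p)\,\theta^{p}}\,x^{p-1}e^{-x/\theta}\,dx, \qquad p>0,\ 0\le x\le\infty, \]

$\bar x/p$ is the MVB estimator of $\theta$ for fixed $p$ with variance $\theta^2/(np)$, whereas if $\theta = 1$, $\frac{1}{n}\sum_{j=1}^{n}\log x_j$, with variance $\frac{1}{n}\frac{d^2}{dp^2}\log\Gamma(p)$, is the MVB estimator of $\frac{d}{dp}\log\Gamma(p)$.

For, evidently

\[ L = \frac{1}{\{\Gamma(p)\}^{n}\,\theta^{np}}\prod_{j=1}^{n} x_j^{\,p-1}e^{-x_j/\theta}, \]

which yields:

\[ \log L = -n\log\Gamma(p) - np\log\theta + (p-1)\sum_{j=1}^{n}\log x_j - \frac{1}{\theta}\sum_{j=1}^{n}x_j. \qquad \ldots(1) \]

Then

\[ \frac{\partial}{\partial\theta}\log L = -\frac{np}{\theta} + \frac{1}{\theta^2}\sum_{j=1}^{n}x_j = \frac{np}{\theta^2}\left(\frac{\bar x}{p}-\theta\right). \]

Comparing with $A(\theta)\{t-\psi(\theta)\}$ leads to

\[ A(\theta) = \frac{np}{\theta^2}, \qquad t = \frac{\bar x}{p} \qquad\text{and}\qquad \psi(\theta) = \theta. \]

Then $V(t) = 1/A(\theta) = \theta^2/(np)$.

Placing $\theta = 1$ in (1) yields:

\[ \log L = -n\log\Gamma(p) + (p-1)\sum_{j=1}^{n}\log x_j - \sum_{j=1}^{n}x_j. \]

Then

\[ \frac{\partial}{\partial p}\log L = -n\frac{d}{dp}\log\Gamma(p) + \sum_{j=1}^{n}\log x_j = n\left[\frac{1}{n}\sum_{j=1}^{n}\log x_j - \frac{d}{dp}\log\Gamma(p)\right], \]

which, in comparison with

\[ \frac{\partial}{\partial p}\log L = A(p)\{t-\psi(p)\}, \]

leads to

\[ A(p) = n, \qquad t = \frac{1}{n}\sum_{j=1}^{n}\log x_j, \qquad \psi(p) = \frac{d}{dp}\log\Gamma(p), \]

with

\[ V(t) = \frac{\psi'(p)}{A(p)} = \frac{1}{n}\frac{d^2}{dp^2}\log\Gamma(p). \]

This completes the proof of our assertion.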

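Theorem 5. The necessary and sufficient condition that a distribution admits the estimation of a suitably chosen function of the parameter with variance equal to the information limit is that

\[ L(x_1, x_2, \ldots, x_n \mid \theta) = h(x_1, x_2, \ldots, x_n)\exp(t\theta_1 + \theta_2), \]

where $\theta_1$ and $\theta_2$ are functions of $\theta$.

Proof. The condition is necessary. To establish this, we recall the Cauchy-Schwarz inequality:

\[ \int\{t-\psi(\theta)\}^2 L\,dx \cdot \int\left(\frac{1}{L}\frac{\partial L}{\partial\theta}\right)^2 L\,dx \ \ge\ \left[\int\{t-\psi(\theta)\}\,\frac{1}{L}\frac{\partial L}{\partial\theta}\,L\,dx\right]^2, \]

where $\int \ldots dx$ stands for $\int\cdots\int \ldots dx_1\ldots dx_n$. The equality is attained only if

\[ t-\psi(\theta) = \lambda\,\frac{1}{L}\frac{\partial L}{\partial\theta}, \]

where $\lambda$ is a constant depending solely on $\theta$. Integrating with respect to $\theta$ leads to

\[ \log L = \int\frac{t-\psi(\theta)}{\lambda}\,d\theta = A + t\theta_1 + \theta_2, \]

where $\theta_1$ and $\theta_2$ are functions of $\theta$ but $A$ is independent of $\theta$.

Then $L(x_1, x_2, \ldots, x_n \mid \theta) = h(x_1, x_2, \ldots, x_n)\exp(t\theta_1+\theta_2)$, proving the necessary condition.

The condition is sufficient. Assume that the condition

\[ L(x_1, \ldots, x_n \mid \theta) = h(x_1, \ldots, x_n)\exp(t\theta_1+\theta_2) \]

holds. Then we shall show that the information limit is attainable.

As a matter of fact,

\[ 1 = \int L\,dx = \int h\exp(t\theta_1+\theta_2)\,dx, \qquad\text{i.e.}\qquad \int h\exp(t\theta_1)\,dx = e^{-\theta_2}, \]

which, on differentiating twice with respect to $\theta_1$, furnishes:

\[ \int h\,t\exp(t\theta_1)\,dx = -e^{-\theta_2}\frac{d\theta_2}{d\theta_1} \]

and

\[ \int h\,t^2\exp(t\theta_1)\,dx = e^{-\theta_2}\left[\left(\frac{d\theta_2}{d\theta_1}\right)^2 - \frac{d^2\theta_2}{d\theta_1^2}\right], \]

so that

\[ \int h\,t\exp(t\theta_1+\theta_2)\,dx = -\frac{d\theta_2}{d\theta_1} \]

and

\[ \int h\,t^2\exp(t\theta_1+\theta_2)\,dx = \left(\frac{d\theta_2}{d\theta_1}\right)^2 - \frac{d^2\theta_2}{d\theta_1^2}. \]

They imply that

\[ E(t) = -\frac{d\theta_2}{d\theta_1}, \qquad E(t^2) = \left(\frac{d\theta_2}{d\theta_1}\right)^2 - \frac{d^2\theta_2}{d\theta_1^2}. \]

Evidently,

\[ V(t) = E(t^2) - \{E(t)\}^2 = -\frac{d^2\theta_2}{d\theta_1^2}. \]

Recall that $I(\theta) = V\left(\dfrac{\partial}{\partial\theta}\log L\right)$.

Now $L = h\exp(t\theta_1+\theta_2)$ gives

\[ \log L = \log h + t\theta_1 + \theta_2, \]

which, on differentiating with regard to $\theta$, provides:

\[ \frac{\partial}{\partial\theta}\log L = t\,\frac{d\theta_1}{d\theta} + \frac{d\theta_2}{d\theta}, \]

so that

\[ I(\theta) = V\left(\frac{\partial}{\partial\theta}\log L\right) = \left(\frac{d\theta_1}{d\theta}\right)^2 V(t) = -\frac{d^2\theta_2}{d\theta_1^2}\left(\frac{d\theta_1}{d\theta}\right)^2. \]

Thus the information limit for the estimation of $\psi(\theta) = -d\theta_2/d\theta_1$ is

\[ \frac{\{\psi'(\theta)\}^2}{I(\theta)} = \frac{\left(\dfrac{d^2\theta_2}{d\theta_1^2}\,\dfrac{d\theta_1}{d\theta}\right)^2}{-\dfrac{d^2\theta_2}{d\theta_1^2}\left(\dfrac{d\theta_1}{d\theta}\right)^2} = -\frac{d^2\theta_2}{d\theta_1^2} = V(t), \]

completing the proof of sufficiency.

Now

\[ t-\psi(\theta) = \lambda\,\frac{\partial\log L}{\partial\theta}, \]

which gives:

\[ V[t-\psi(\theta)] = V\left[\lambda\,\frac{\partial\log L}{\partial\theta}\right] = \lambda^2\,V\left[\frac{\partial\log L}{\partial\theta}\right] = \lambda^2 I. \]

But $V[t-\psi(\theta)] = \{\psi'(\theta)\}^2/I$. Then we have $\lambda^2 I = \{\psi'(\theta)\}^2/I$, which implies that $\lambda = \psi'(\theta)/I$, so that $\lambda$ is a unique function of $\theta$. This asserts that $t$ is unique as the best unbiased estimate of $\psi(\theta)$.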
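As an illustration of the relations $E(t) = -d\theta_2/d\theta_1$ and $V(t) = -d^2\theta_2/d\theta_1^2$ obtained in the proof of Theorem 5, consider $n$ Poisson observations, for which $t=\sum x_j$, $\theta_1=\log\theta$ and $\theta_2=-n\theta=-ne^{\theta_1}$. The sketch below is our own and rests on that assumed example; it differentiates $\theta_2$ numerically and compares the result with the mean and variance of $t$, which is itself a Poisson variable with mean $n\theta$.

```python
import math

def theta2(theta1, n):
    """theta_2 as a function of theta_1 = log(theta) for n Poisson observations."""
    return -n * math.exp(theta1)

def derivatives(theta1, n, h=1e-4):
    """Central-difference first and second derivatives of theta_2 w.r.t. theta_1."""
    f0 = theta2(theta1, n)
    fp = theta2(theta1 + h, n)
    fm = theta2(theta1 - h, n)
    return (fp - fm) / (2.0 * h), (fp - 2.0 * f0 + fm) / h ** 2

def poisson_moments(mean, terms=200):
    """Mean and variance of a Poisson(mean) variable by direct summation."""
    p = math.exp(-mean)          # probability of k = 0
    m1 = m2 = 0.0
    for k in range(terms):
        m1 += k * p
        m2 += k * k * p
        p *= mean / (k + 1)      # recurrence to the probability of k + 1
    return m1, m2 - m1 ** 2

n, theta = 5, 1.3                # illustrative values only
d1, d2 = derivatives(math.log(theta), n)
mean_t, var_t = poisson_moments(n * theta)
print(-d1, mean_t)               # both approximate n * theta
print(-d2, var_t)                # both approximate n * theta
```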


Theorem 6. Subject to certain general conditions of regularity, the mean square deviation $E(\hat\theta-\theta)^2$ can never fall below a positive limit depending only on the distribution function $F(x\mid\theta)$, the size $n$ of the sample and the bias $b(\theta)$. In the particular case when $\hat\theta$ is unbiased whatever be the true value of $\theta$ in a non-degenerate interval $A$, the bias $b(\theta)$ is identically zero, and the variance $V(\hat\theta)$ can never fall below a certain limit depending only on $F$ and $n$.

Proof. We confine ourselves to the continuous distribution only.

Recall that $\hat\theta$ is an unbiased estimate of $\theta$ if $E(\hat\theta) = \theta$. Assume that an estimate $\hat\theta$ has a bias $b(\theta)$ depending on $\theta$. Then we have

\[ E(\hat\theta) = \theta + b(\theta) = \psi(\theta), \text{ say.} \]

Now

\[ E(\hat\theta-\theta)^2 = E\left[\{\hat\theta-\psi(\theta)\} + \{\psi(\theta)-\theta\}\right]^2 = E[\hat\theta-\psi(\theta)]^2 + [\psi(\theta)-\theta]^2, \]

where the cross-product term vanishes in view of $E[\hat\theta-\psi(\theta)]$ being zero.

Therefore, $E(\hat\theta-\theta)^2 = V(\hat\theta) + [\psi(\theta)-\theta]^2$, which, in view of $\theta + b(\theta) = \psi(\theta)$, gives

\[ E(\hat\theta-\theta)^2 = V(\hat\theta) + \{b(\theta)\}^2. \]

The Cramér-Rao inequality provides:

\[ V(\hat\theta) \ge \{\psi'(\theta)\}^2/I(\theta), \]

and hence setting $\psi(\theta) = \theta + b(\theta)$ yields:

\[ V(\hat\theta) \ge \{1+b'(\theta)\}^2/I(\theta), \]

where $I(\theta) = nE\left[\left(\dfrac{\partial\log f}{\partial\theta}\right)^2\right]$.

Consequently,

\[ E(\hat\theta-\theta)^2 \ \ge\ \{b(\theta)\}^2 + \{1+b'(\theta)\}^2/I(\theta) \ \ge\ \{1+b'(\theta)\}^2/I(\theta). \]

This completes the proof of the theorem.

The reader will prove the theorem himself in the case of the discrete distribution.
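The decomposition $E(\hat\theta-\theta)^2 = V(\hat\theta)+\{b(\theta)\}^2$ and the lower limit of Theorem 6 can be verified exactly in a small discrete case. The sketch below is an illustration of our own, not taken from the text: it assumes $n$ Bernoulli trials with $X$ successes and the biased estimator $(X+1)/(n+2)$ of $p$, for which $b(p)=(1-2p)/(n+2)$ and $I(p)=n/(pq)$.

```python
from math import comb

def binomial_pmf(n, p):
    """Probabilities of 0, 1, ..., n successes in n Bernoulli(p) trials."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

def mse_decomposition(n, p):
    """Exact mean square deviation, variance and bias of (X + 1)/(n + 2)."""
    pmf = binomial_pmf(n, p)
    est = [(k + 1) / (n + 2) for k in range(n + 1)]
    mean = sum(e * w for e, w in zip(est, pmf))
    var = sum((e - mean) ** 2 * w for e, w in zip(est, pmf))
    mse = sum((e - p) ** 2 * w for e, w in zip(est, pmf))
    return mse, var, mean - p

def lower_bound(n, p):
    """{b(p)}^2 + {1 + b'(p)}^2 / I(p) for the estimator above."""
    bias = (1 - 2 * p) / (n + 2)
    bias_slope = -2 / (n + 2)
    information = n / (p * (1 - p))
    return bias ** 2 + (1 + bias_slope) ** 2 / information

mse, var, bias = mse_decomposition(10, 0.3)
print(mse, var + bias ** 2, lower_bound(10, 0.3))
```

In this particular example the estimator is a linear function of the efficient estimator $X/n$, so the limit is attained and the three printed values coincide.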

Minimum Variance estimator. The equation

\[ \frac{\partial\log L}{\partial\theta} = A(\theta)\{t-\psi(\theta)\} \]

determines a condition on the frequency function under which an MVB estimator of $\psi(\theta)$ exists. Assume that the frequency function is not of this form. Then there may still exist an estimator of $\psi(\theta)$ which has, uniformly in $\theta$, a smaller variance than any other estimator. Such an estimator is called a minimum variance (MV) estimator. Equivalently, the least attainable variance may be greater than the MVB. Caution: The least attainable variance may be less than the MVB when the regularity conditions leading to the MVB do not hold.

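Bhattacharya's Inequality

Theorem 7. Assume that $t$ is an unbiased estimator of some function of $\theta$, say $\psi(\theta)$. Define

\[ \psi^{(r)} = \frac{d^{r}\psi(\theta)}{d\theta^{r}}, \qquad J_{rp} = E\left[\frac{1}{L}\frac{\partial^{r}L}{\partial\theta^{r}}\cdot\frac{1}{L}\frac{\partial^{p}L}{\partial\theta^{p}}\right], \qquad r, p = 1, 2, \ldots, s, \]

and let $J^{-1}_{rp}$ denote the $(r, p)$th element of the inverse of the matrix $(J_{rp})$.

Then

\[ V(t) \ \ge\ \sum_{r=1}^{s}\sum_{p=1}^{s}\psi^{(r)}\,J^{-1}_{rp}\,\psi^{(p)}. \]

This inequality is due to Bhattacharya.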

Theorem 8. If an MV (minimum variance) estimator exists, it is always unique, irrespective of whether any bound is attained.

Proof. To this end, assume $t_1$ and $t_2$ are MV unbiased estimators of $\psi(\theta)$, each with variance $V$. Construct the new estimator $t_3$ such that

\[ t_3 = \tfrac{1}{2}(t_1+t_2), \]

which also estimates $\psi(\theta)$ without bias. Then

\[ V(t_3) = V\left[\tfrac{1}{2}(t_1+t_2)\right] = \tfrac{1}{4}\left[V(t_1)+V(t_2)+2\,\mathrm{Cov}(t_1, t_2)\right]. \qquad \ldots(1) \]

As a matter of fact,

\[ \mathrm{Cov}(t_1, t_2) = \int\cdots\int (t_1-\psi)(t_2-\psi)\,L\,dx_1\ldots dx_n. \]

Then by the Cauchy-Schwarz inequality,

\[ \mathrm{Cov}(t_1, t_2) \le \left\{\int\cdots\int (t_1-\psi)^2 L\,dx_1\ldots dx_n \cdot \int\cdots\int (t_2-\psi)^2 L\,dx_1\ldots dx_n\right\}^{1/2} = [V(t_1)V(t_2)]^{1/2} = V, \]

since $V(t_1) = V(t_2) = V$, by dint of which (1) yields:

\[ V(t_3) \le V, \]

which is clearly a contradiction to our assumption that $t_1$ and $t_2$ have MV unless the equality holds. This asserts that

\[ \mathrm{Cov}(t_1, t_2) = V. \qquad \ldots(2) \]

This situation arises only when

\[ t_1-\psi = \lambda(\theta)\,(t_2-\psi), \qquad \ldots(3) \]

that is, when the variables are proportional. This implies that

\[ \mathrm{Cov}(t_1, t_2) = \int\cdots\int \lambda(\theta)\,(t_2-\psi)^2\,L\,dx_1\ldots dx_n = \lambda(\theta)\,V(t_2) = \lambda(\theta)\,V, \]

which, in view of (2), leads to

\[ \lambda(\theta) = 1. \]

Then from (3), we find

\[ t_1 = t_2 \]

identically. Consequently an MV estimator is unique. This finishes the proof.

Remark. Assume that $\rho$ is the correlation between any two estimates. Then $\rho^2 \le 1$. Reason: by definition,

\[ \rho = \frac{\mathrm{Cov}(t_1, t_2)}{\sqrt{V(t_1)V(t_2)}}, \qquad \rho^2 = \frac{\mathrm{Cov}^2(t_1, t_2)}{V(t_1)V(t_2)} \le 1, \]

since, in view of the Cauchy-Schwarz inequality, $\mathrm{Cov}^2(t_1, t_2) \le V(t_1)V(t_2)$.

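Theorem 9. Assume that $t$ is the MV unbiased estimator of $\psi(\theta)$ with variance $V$, and that $t_1$ and $t_2$ are any two other unbiased estimators of $\psi(\theta)$. If $V(t_1) = k_1V$, $V(t_2) = k_2V$, $k_1, k_2 > 1$, then the correlation $\rho$ between $t_1$ and $t_2$ satisfies

\[ \frac{1}{(k_1k_2)^{1/2}} - \left\{\frac{(k_1-1)(k_2-1)}{k_1k_2}\right\}^{1/2} \le \rho \le \frac{1}{(k_1k_2)^{1/2}} + \left\{\frac{(k_1-1)(k_2-1)}{k_1k_2}\right\}^{1/2} \qquad \ldots(1) \]

and hence, on setting $E_1 = 1/k_1$, $E_2 = 1/k_2$,

\[ (E_1E_2)^{1/2} - \{(1-E_1)(1-E_2)\}^{1/2} \le \rho \le (E_1E_2)^{1/2} + \{(1-E_1)(1-E_2)\}^{1/2} \qquad \ldots(2) \]

and finally, if either $E_1$ or $E_2 = 1$, i.e. either $t_1$ or $t_2$ is the MV estimator of $\psi(\theta)$,

\[ \rho = E^{1/2} \qquad \ldots(3) \]

where $E$ is the reciprocal of the relative variance of the other estimator.

Proof. We construct a new estimator $t_3$ such that

\[ t_3 = a\,t_1 + (1-a)\,t_2, \]

which also estimates $\psi(\theta)$, with variance

\[ V(t_3) = a^2V(t_1) + (1-a)^2V(t_2) + 2a(1-a)\,\mathrm{Cov}(t_1, t_2). \qquad \ldots(4) \]

Our hypothesis reads that

\[ V(t_1) = k_1V, \qquad V(t_2) = k_2V, \qquad k_1, k_2 > 1. \qquad \ldots(5) \]

Then (4) provides:

\[ V(t_3) = a^2k_1V + (1-a)^2k_2V + 2a(1-a)\,\mathrm{Cov}(t_1, t_2). \qquad \ldots(6) \]

Recall that

\[ \rho = \frac{\mathrm{Cov}(t_1, t_2)}{\{V(t_1)V(t_2)\}^{1/2}}. \]

This, by dint of (5), yields:

\[ \mathrm{Cov}(t_1, t_2) = \rho\,(k_1k_2)^{1/2}\,V, \]

by means of which (6) assumes the form

\[ \frac{V(t_3)}{V} = a^2k_1 + (1-a)^2k_2 + 2a(1-a)\,\rho\,(k_1k_2)^{1/2}. \qquad \ldots(7) \]

In fact, $V(t_3) \ge V$, and hence (7) provides:

\[ a^2k_1 + (1-a)^2k_2 + 2a(1-a)\,\rho\,(k_1k_2)^{1/2} \ge 1, \]

which, after simplification, gives

\[ a^2\{k_1+k_2-2\rho(k_1k_2)^{1/2}\} + 2a\{\rho(k_1k_2)^{1/2}-k_2\} + (k_2-1) \ge 0. \]

Since this holds for every $a$, the roots of the corresponding quadratic equation are complex or equal, and hence the discriminant of the left-hand side cannot be positive. Therefore,

\[ \{\rho(k_1k_2)^{1/2}-k_2\}^2 \le \{k_1+k_2-2\rho(k_1k_2)^{1/2}\}(k_2-1), \]

which reduces to

\[ \{\rho(k_1k_2)^{1/2}-1\}^2 \le (k_1-1)(k_2-1), \]

from which we obtain

\[ -\{(k_1-1)(k_2-1)\}^{1/2} \le \rho(k_1k_2)^{1/2}-1 \le \{(k_1-1)(k_2-1)\}^{1/2}. \]

Consequently,

\[ \frac{1}{(k_1k_2)^{1/2}} - \left\{\frac{(k_1-1)(k_2-1)}{k_1k_2}\right\}^{1/2} \le \rho \le \frac{1}{(k_1k_2)^{1/2}} + \left\{\frac{(k_1-1)(k_2-1)}{k_1k_2}\right\}^{1/2}. \]

This proves (1).

Letting $E_1 = 1/k_1$, $E_2 = 1/k_2$ in the preceding inequality furnishes:

\[ (E_1E_2)^{1/2} - \{(1-E_1)(1-E_2)\}^{1/2} \le \rho \le (E_1E_2)^{1/2} + \{(1-E_1)(1-E_2)\}^{1/2}, \]

proving (2).

Setting $E_1 = 1$, $E_2 = E$ leads to

\[ \sqrt{E} \le \rho \le \sqrt{E}, \qquad\text{i.e.}\qquad \rho = \sqrt{E}. \]

This proves (3).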
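The bounds (1) of Theorem 9 are easily tabulated. The following short sketch, with a function name of our own choosing, evaluates the two limits for given $k_1$ and $k_2$ and shows how they close up to the single value $\sqrt{E}$ when one of the estimators is itself the MV estimator.

```python
import math

def correlation_bounds(k1, k2):
    """Lower and upper limits for rho from inequality (1), with k1, k2 >= 1."""
    centre = 1.0 / math.sqrt(k1 * k2)
    half_width = math.sqrt((k1 - 1.0) * (k2 - 1.0) / (k1 * k2))
    return centre - half_width, centre + half_width

# Two inefficient estimators with relative variances 2 and 4.
print(correlation_bounds(2.0, 4.0))

# One estimator is the MV estimator (k1 = 1); the other has E = 1/4,
# so both limits reduce to sqrt(E) = 0.5.
print(correlation_bounds(1.0, 4.0))
```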

15.5. Efficiency:

Concept of efficiency in point estimation

Of all statistics which converge stochastically to a parameter $\theta$, the practically important ones are those which converge more rapidly. Such statistics give large deviations from the true value of the parameter less frequently, at least in large samples. We, therefore, wish to find a criterion for judging which of two statistics converges more rapidly. In the case of asymptotically normally distributed statistics, the rapidity of convergence is measured by the invariance, or the reciprocal of the variance. Reason: In a normal distribution the probability of a departure exceeding $\lambda$ times the standard deviation is a decreasing function of $\lambda$ only, i.e.

\[ P\{|t-\theta| > \lambda\sigma_t\} = \sqrt{\frac{2}{\pi}}\int_{\lambda}^{\infty}e^{-u^2/2}\,du. \]

This implies that the probability of a departure exceeding a given value decreases with decrease in the variance. Hence the statistic with the smallest asymptotic variance is preferred and is called the efficient estimator. The efficiency of any other estimator can be measured by the ratio of the variance of the efficient estimator to that of the other.

Efficiency, though linked with the minimum variance, is essentially a large sample concept. The comparison is confined only to the class of statistics which are asymptotically normally distributed. There is no serious objection to this restriction in view of the fact that a large class of statistics obeys this property.
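The remark that a smaller variance makes a given departure less probable can be seen numerically. A minimal sketch, assuming a normally distributed statistic and using the complementary error function of the Python standard library (the function name is ours):

```python
import math

def prob_deviation_exceeds(delta, sigma):
    """P(|t - theta| > delta) when t is normal with mean theta and s.d. sigma."""
    return math.erfc(delta / (sigma * math.sqrt(2.0)))

# The same departure delta = 1 under a decreasing standard deviation.
for sigma in (2.0, 1.0, 0.5):
    print(sigma, prob_deviation_exceeds(1.0, sigma))
```

The probability falls steadily as the standard deviation is reduced, which is the sense in which the statistic with the smaller variance converges more rapidly.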



The Cramér-Rao inequality gives

\[ V(t) \ \ge\ \frac{1}{n\,E\left[\left(\dfrac{\partial\log f}{\partial\theta}\right)^2\right]}. \qquad \ldots(1) \]

This represents the smallest possible value of $V(t)$. The efficiency is defined as

\[ e(t) = \frac{\text{the minimum value of } V(t)}{\text{the actual value of } V(t)}. \]

We have always $0 \le e \le 1$.

When the sign of equality holds in (1), then $V(t)$ attains its smallest possible value and we have $e(t) = 1$. In this case we say that $t$ is an efficient estimator.

Asymptotically efficient estimator. Assume that

\[ \hat t = \hat t(x_1, \ldots, x_n) \]

is a regular unbiased estimator defined for all sufficiently large values of $n$. Now we wish to consider the asymptotic behaviour of $\hat t$ as $n \to \infty$. For this purpose suppose that $t$ is an unbiased and most efficient estimator of the parameter $\theta$, and assume further that $t_1$ is another unbiased estimator of the parameter. As a measure of efficiency of $t_1$, the expression

\[ e = \frac{V(t)}{V(t_1)} \qquad \ldots(2) \]

may serve our purpose. For the most efficient unbiased estimator we have $e = 1$.

Often we deal with estimates which are not efficient but whose efficiency satisfies the condition

\[ \lim_{n\to\infty} e(t) = 1 \qquad \ldots(3) \]

and which are at least asymptotically unbiased. Such estimates are called asymptotically most efficient estimates. From the practical point of view, they are most efficient estimates from large samples.

Note that the efficiency $e$ may converge to a limit $e_0 < 1$ as $n \to \infty$. Then the relation (3) does not hold. Now we have

\[ \lim_{n\to\infty} e(t) = e_0(t) = \frac{1}{c\,E\left[\left(\dfrac{\partial\log f}{\partial\theta}\right)^2\right]}, \]

because the standard deviation of the estimator $t$ is of order $n^{-1/2}$ for large $n$, and then $V(t) \sim c\,n^{-1}$, where $c$ is a constant; $e_0$ is called the asymptotic efficiency of the estimator.

When $e_0(t) = 1$, $t$ is termed an asymptotically efficient estimator of $\theta$.

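Illustrative Examples.

Ex. 1. Examine the efficiency of the median as an estimate of the location parameter of (i) the normal distribution and (ii) the Cauchy distribution.

(i) For large $n$ we have

\[ V(\text{median}) = \frac{\pi\sigma^2}{2n} \qquad\text{and}\qquad V(\text{mean}) = \frac{\sigma^2}{n}, \]

where $\sigma^2$ is the parent variance.

The efficiency of the median as an estimate of the location parameter of the normal distribution is

\[ e = \frac{\sigma^2/n}{\pi\sigma^2/(2n)} = \frac{2}{\pi} \approx 0.637. \]

In conclusion, the sample median is not asymptotically the most efficient estimate of the location parameter; alternatively, the mean is more efficient than the median, for large $n$ at least.

(ii) The Cauchy distribution is given by

\[ dF(x) = \frac{1}{\pi}\,\frac{dx}{1+(x-\theta)^2}, \qquad -\infty < x < \infty. \]

Evidently,

\[ L = \left(\frac{1}{\pi}\right)^{n}\prod_{j=1}^{n}\frac{1}{1+(x_j-\theta)^2}. \]

Then

\[ \frac{\partial\log L}{\partial\theta} = \sum_{j=1}^{n}\frac{2(x_j-\theta)}{1+(x_j-\theta)^2} = 0 \]

is the likelihood equation, of degree $(2n-1)$ in $\theta$. Hence it is a difficult problem to solve. We may, however, find the asymptotic variance of the solution $\hat\theta$ from

\[ \frac{1}{V(\hat\theta)} = -n\,E\left(\frac{\partial^2\log f}{\partial\theta^2}\right). \]

Now

\[ f = \frac{1}{\pi}\,\frac{1}{1+(x-\theta)^2}, \]

which yields:

\[ \log f = \text{const.} - \log\{1+(x-\theta)^2\}, \]

giving

\[ \frac{\partial\log f}{\partial\theta} = \frac{2(x-\theta)}{1+(x-\theta)^2} \qquad\text{and}\qquad \frac{\partial^2\log f}{\partial\theta^2} = \frac{2(x-\theta)^2-2}{\{1+(x-\theta)^2\}^2}. \]

Then

\[ E\left(\frac{\partial^2\log f}{\partial\theta^2}\right) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{2(x-\theta)^2-2}{\{1+(x-\theta)^2\}^3}\,dx = \frac{4}{\pi}\int_{0}^{\infty}\frac{x^2-1}{(1+x^2)^3}\,dx = \frac{4}{\pi}\int_{0}^{\infty}\frac{(1+x^2)-2}{(1+x^2)^3}\,dx \]

\[ = \frac{4}{\pi}\left[\int_{0}^{\pi/2}\cos^2\phi\,d\phi - 2\int_{0}^{\pi/2}\cos^4\phi\,d\phi\right], \quad\text{on substituting } x = \tan\phi, \]

\[ = \frac{4}{\pi}\left[\frac{\pi}{4} - 2\cdot\frac{3\pi}{16}\right] = -\frac{1}{2}. \]

Hence

\[ V(\hat\theta) = \frac{2}{n}. \]

The median of the sample has the large sample variance

\[ V(\text{median}) = \frac{\pi^2}{4n}. \]

Hence

\[ e = \frac{2/n}{\pi^2/(4n)} = \frac{8}{\pi^2} = 0.81 \text{ approximately.} \]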
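The integral evaluated in part (ii) can be confirmed numerically. The sketch below is our own check under the same substitution $x-\theta=\tan\phi$, which turns $f\,dx$ into $d\phi/\pi$; it applies the midpoint rule over $(-\pi/2, \pi/2)$ and then forms the efficiency of the median, in which the sample size cancels.

```python
import math

def second_derivative_log_f(u):
    """d^2 log f / d theta^2 for the Cauchy density, written in u = x - theta."""
    return (2.0 * u * u - 2.0) / (1.0 + u * u) ** 2

def expected_second_derivative(steps=20000):
    """E[d^2 log f / d theta^2] by the midpoint rule after u = tan(phi)."""
    width = math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = -math.pi / 2.0 + (i + 0.5) * width
        total += second_derivative_log_f(math.tan(phi))
    # Under u = tan(phi) the probability element f(u) du becomes dphi / pi.
    return total * width / math.pi

info_per_observation = -expected_second_derivative()
# Efficiency = {1 / (n * info)} / {pi^2 / (4 n)}; the factor n cancels.
efficiency_of_median = (1.0 / info_per_observation) / (math.pi ** 2 / 4.0)
print(info_per_observation, efficiency_of_median)
```

The first value should be close to $1/2$ and the second close to $8/\pi^2$.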


Ex. 2. Find the correlation coefficient between an inefficient estimate and an efficient one.

For its solution see Theorem 9.

Ex. 3. Let X be a random variable normally distributed with mean zero and unknown standard deviation $\sigma$. Consider, as an estimate of $\sigma$, the mean deviation

\[ d = \frac{1}{n}\sum_{j=1}^{n}|x_j|. \]

Find its efficiency.

In continuation of Ex. 4, § 15.3, we find

\[ E(|X|) = \sigma\sqrt{2/\pi}, \qquad E(d) = \sigma\sqrt{2/\pi}, \qquad E\{\sqrt{\pi/2}\;d\} = \sigma, \]

\[ V\{\sqrt{\pi/2}\;d\} = \frac{\sigma^2(\pi-2)}{2n}, \qquad V(\text{S.D.}) = \frac{\sigma^2}{2n}. \]

Then

\[ \text{efficiency} = \frac{\sigma^2/(2n)}{\sigma^2(\pi-2)/(2n)} = \frac{1}{\pi-2} \approx 0.8760. \]

Note that $d\sqrt{\pi/2}$ is the unbiased estimate of $\sigma$, with efficiency $1/(\pi-2) \approx 0.8760$.

Remark. The efficiency of the unbiased estimate $d\sqrt{\pi/2}$ remains the same if the mean zero is replaced by the mean $\mu$, because

\[ E\{|X-\mu|\} = \sigma\sqrt{2/\pi}, \qquad E\{\sqrt{\pi/2}\;d\} = \sigma, \qquad V\{\sqrt{\pi/2}\;d\} = \frac{\sigma^2(\pi-2)}{2n}. \]

Ex. 4. The variance of any regular unbiased estimate $\hat p$ from a sample of $n$ values drawn from the binomial distribution

\[ p_k = \binom{N}{k}p^{k}q^{N-k}, \]

where $N$ is a known integer and $q = 1-p$, is at least $pq/(Nn)$, and $\hat p = \bar x/N$ is an efficient estimate of $p$.

In any regular estimation case of the discrete type, the inequality corresponding to the Cramér-Rao inequality is

\[ V(\hat\theta) \ \ge\ \frac{1}{n\sum_{k}\left(\dfrac{\partial\log p_k}{\partial\theta}\right)^2 p_k}. \]

Here $\theta = p$ is the parameter to be estimated. Hence

\[ \sum_{k}\left(\frac{\partial\log p_k}{\partial p}\right)^2 p_k = \sum_{k=0}^{N}\left(\frac{k}{p}-\frac{N-k}{q}\right)^2 p_k = \sum_{k=0}^{N}\left(\frac{k-Np}{pq}\right)^2 p_k \]

\[ = \frac{1}{(pq)^2}\sum_{k=0}^{N}\binom{N}{k}p^{k}q^{N-k}(k-Np)^2 = \frac{\mu_2}{(pq)^2} = \frac{Npq}{(pq)^2} = \frac{N}{pq}, \]

where $\mu_2$ is the second moment about the mean $Np$. Hence

\[ V(\hat p) \ \ge\ \frac{1}{(nN)/(pq)} = \frac{pq}{nN}. \]

This shows that the variance of any regular unbiased estimate $\hat p$ from a sample of $n$ values is at least equal to $pq/(nN)$.

Let $\hat p$ be the particular estimate given by $\bar x = N\hat p$, that is,

\[ \hat p = \frac{\bar x}{N} = \frac{1}{nN}\sum_{j=1}^{n}x_j. \]

Then

\[ E(\hat p) = E\left\{\frac{1}{nN}\sum_{j=1}^{n}x_j\right\} = \frac{1}{nN}\sum_{j=1}^{n}E(x_j) = \frac{1}{Nn}\sum_{j=1}^{n}Np = \frac{nNp}{Nn} = p \]

and

\[ V(\hat p) = V\left(\frac{1}{nN}\sum_{j=1}^{n}x_j\right) = \frac{1}{n^2N^2}\sum_{j=1}^{n}V(x_j), \]

since the $x_j$ are independent observations,

\[ = \frac{nNpq}{N^2n^2} = \frac{pq}{Nn}. \]

This implies that $\hat p$ is an efficient estimate of the parameter $p$.
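The sum $\sum_k (\partial\log p_k/\partial p)^2 p_k = N/(pq)$ used in Ex. 4 can be checked by direct summation. This is a small sketch of our own; the values of $N$, $p$ and $n$ are illustrative assumptions.

```python
from math import comb

def binomial_information(N, p):
    """Sum over k of (d log p_k / dp)^2 * p_k for the binomial distribution."""
    q = 1.0 - p
    return sum((k / p - (N - k) / q) ** 2 * comb(N, k) * p ** k * q ** (N - k)
               for k in range(N + 1))

def variance_bound(N, p, n):
    """Lower limit pq/(Nn) for the variance of a regular unbiased estimate of p."""
    return 1.0 / (n * binomial_information(N, p))

N, p, n = 7, 0.3, 20             # illustrative values only
print(binomial_information(N, p), N / (p * (1.0 - p)))
print(variance_bound(N, p, n), p * (1.0 - p) / (N * n))
```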
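Ex. 5. The variance of any regular unbiased estimate $\hat\lambda$ of the parameter $\lambda$ from a sample of $n$ values drawn from the Poisson distribution

\[ p_k = \frac{\lambda^{k}}{k!}\,e^{-\lambda} \]

is at least equal to $\lambda/n$, and the mean $\bar x = \hat\lambda$ is an efficient estimate of $\lambda$.

Evidently,

\[ \sum_{k=0}^{\infty}\left(\frac{\partial\log p_k}{\partial\lambda}\right)^2 p_k = \sum_{k=0}^{\infty}\left(\frac{k}{\lambda}-1\right)^2 p_k = \frac{1}{\lambda^2}\sum_{k=0}^{\infty}(k-\lambda)^2 p_k = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}. \]

Hence

\[ V(\hat\lambda) \ \ge\ \frac{1}{n/\lambda} = \frac{\lambda}{n}. \]

For the particular estimate $\hat\lambda = \bar x = \sum_j x_j/n$, we have $E(\hat\lambda) = \lambda$ and $V(\hat\lambda) = \lambda/n$.

This proves our assertion.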

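Ex. 6. The mean $\bar x$ is an efficient estimate of $\theta$ in the normal population $N(\theta, \sigma^2)$.

Evidently

\[ f(x\mid\theta) = \frac{1}{\sigma\sqrt{2\pi}}\exp\{-(x-\theta)^2/2\sigma^2\}. \]

We wish to estimate the parameter $\theta$. Note that $\sigma$ is a known constant.

Let $A$ be any non-degenerate interval. Then we find

\[ E\left[\left(\frac{\partial\log f}{\partial\theta}\right)^2\right] = \int_{-\infty}^{\infty}\left(\frac{x-\theta}{\sigma^2}\right)^2 f\,dx = \frac{1}{\sigma^2}. \]

Hence

\[ V(\hat\theta) \ \ge\ \frac{1}{n\cdot 1/\sigma^2} = \frac{\sigma^2}{n}. \]

For the particular estimate $\hat\theta = \bar x = \sum x_j/n$,

\[ E(\hat\theta) = E\Big(\sum x_j/n\Big) = \frac{1}{n}\sum E(x_j) = \theta \qquad\text{and}\qquad V(\hat\theta) = \frac{1}{n^2}\sum V(x_j) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}. \]

This asserts that $\hat\theta = \bar x$ is an efficient estimate of $\theta$ in $N(\theta, \sigma^2)$.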

Ex. 7. The variance of any regular unbiased estimate of $\sigma^2$ in

\[ f(x\mid\sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \]

where $\mu$ is a known constant, is at least $2\sigma^4/n$, but $\frac{1}{n-1}\sum(x_j-\bar x)^2$ is not an efficient estimate of $\sigma^2$, whereas the estimate $s_0^2 = \frac{1}{n}\sum(x_j-\mu)^2$ provides an efficient estimate of $\sigma^2$.

Note that $\sigma^2$ is the parameter to be estimated whereas $\mu$ is a known constant. Let $A$ be any non-degenerate interval, and select for $A$ any finite interval $a < \sigma^2 < b$ such that $a > 0$. Evidently,

\[ E\left[\left(\frac{\partial\log f}{\partial\sigma^2}\right)^2\right] = \int_{-\infty}^{\infty}\left(\frac{(x-\mu)^2}{2\sigma^4}-\frac{1}{2\sigma^2}\right)^2 f\,dx = \frac{1}{2\sigma^4}. \]

Then $V(\hat\sigma^2) \ge 2\sigma^4/n$.

Consider the sample variance

\[ s^2 = \frac{1}{n}\sum_{j=1}^{n}(x_j-\bar x)^2. \]

Then, applying the correction to $s^2$ for bias, we have

\[ \frac{n}{n-1}\,s^2 = \frac{1}{n-1}\sum_{j=1}^{n}(x_j-\bar x)^2, \]

which is an unbiased estimate of $\sigma^2$ with the variance $2\sigma^4/(n-1)$. Clearly this is not an efficient estimate, but an estimate of efficiency $\dfrac{n-1}{n} < 1$.

On the other hand, the estimate $s_0^2 = \frac{1}{n}\sum(x_j-\mu)^2$ is legitimate because $\mu$ is a known constant. The reader will convince himself by verifying that $s_0^2$ has the mean $\sigma^2$ and the variance $2\sigma^4/n$, and hence $s_0^2$ provides an efficient estimate of $\sigma^2$.

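Ex. 8. The variance of any regular unbiased estimate $\hat\sigma$ of $\sigma$ in the normal population

\[ f(x\mid\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \]

where $\mu$ is a known constant, is at least equal to $\sigma^2/(2n)$. Assume $s$ is the standard deviation of the random sample of $n$ observations from $f(x\mid\sigma)$. Define $s'$ as follows:

\[ s' = \sqrt{\frac{n}{2}}\;\frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}\;s. \]

Then the efficiency $e(s')$ tends to 1 as $n\to\infty$, and for smaller $n$ the efficiency is smaller than 1. Verify that $e(s') = \dfrac{1}{2(\pi-2)} = 0.4380$ for $n = 2$, and $= \dfrac{\pi}{6(4-\pi)} = 0.6100$ for $n = 3$. Construct the expression $s_0'$ such that

\[ s_0' = \sqrt{\frac{n}{2}}\;\frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n+1}{2}\right)}\;s_0, \qquad\text{where}\qquad s_0 = \left[\frac{1}{n}\sum_{j=1}^{n}(x_j-\mu)^2\right]^{1/2}. \]

Then $s_0'$ is an unbiased estimate of $\sigma$ with variance $\dfrac{\sigma^2}{2n} + O\left(\dfrac{1}{n^2}\right)$ and efficiency $e(s_0') \to 1$ as $n\to\infty$. Verify that $e(s_0') = \dfrac{\pi}{4(4-\pi)} \approx 0.915$ for $n = 2$, and $= \dfrac{4}{3(3\pi-8)} = 0.9358$ for $n = 3$. Draw your conclusion by comparing $e(s')$ with $e(s_0')$.

Our hypothesis:

\[ f(x\mid\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \]

from which we find

\[ \log f = \text{const.} - \log\sigma - \frac{(x-\mu)^2}{2\sigma^2}, \]

giving

\[ \frac{\partial\log f}{\partial\sigma} = -\frac{1}{\sigma} + \frac{(x-\mu)^2}{\sigma^3} \qquad\text{and}\qquad E\left[\left(\frac{\partial\log f}{\partial\sigma}\right)^2\right] = \int_{-\infty}^{\infty}\left(\frac{\partial\log f}{\partial\sigma}\right)^2 f\,dx = \frac{2}{\sigma^2}. \]

The reader will verify himself the truth of this fact. Hence

\[ V(\hat\sigma) \ \ge\ \frac{1}{n\cdot 2/\sigma^2} = \frac{\sigma^2}{2n}. \]

This shows that the variance of any regular unbiased estimate $\hat\sigma$ is at least equal to $\sigma^2/(2n)$.

In view of our hypothesis, we have

\[ s' = \sqrt{\frac{n}{2}}\;\frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}\;s, \]

where $s$ is the S.D. of the sample. Then

\[ E(s') = \sqrt{\frac{n}{2}}\;\frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}\;E(s) = \sqrt{\frac{n}{2}}\;\frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}\cdot\sqrt{\frac{2}{n}}\;\frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)}\;\sigma = \sigma \]

and

\[ V(s') = \left[\frac{n-1}{2}\left\{\frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}\right\}^2 - 1\right]\sigma^2 = \left[\frac{1}{2n} + O\left(\frac{1}{n^2}\right)\right]\sigma^2. \]

Hence

\[ e(s') = \frac{\sigma^2/(2n)}{V(s')} \to 1 \quad\text{as } n\to\infty. \]

Let $n = 2$. Then the lower limit is $\sigma^2/(2n) = \sigma^2/4$ and

\[ V(s') = \left[\frac{1}{2}\left\{\frac{\Gamma(1/2)}{\Gamma(1)}\right\}^2 - 1\right]\sigma^2 = \left(\frac{\pi}{2}-1\right)\sigma^2. \]

Then

\[ e(s') = \frac{\sigma^2/4}{(\pi/2-1)\sigma^2} = \frac{1}{2(\pi-2)} = 0.4380. \]

Again, let $n = 3$. Then the lower limit is $\sigma^2/6$ and

\[ V(s') = \left[\left\{\frac{\Gamma(1)}{\Gamma(3/2)}\right\}^2 - 1\right]\sigma^2 = \left(\frac{4}{\pi}-1\right)\sigma^2. \]

Hence

\[ e(s') = \frac{\sigma^2/6}{(4/\pi-1)\sigma^2} = \frac{\pi}{6(4-\pi)} = 0.6100. \]

Evidently, $E(s_0') = \sigma$, showing that $s_0'$ is an unbiased estimate of $\sigma$, with variance $\dfrac{\sigma^2}{2n} + O\left(\dfrac{1}{n^2}\right)$.

Then $e(s_0') \to 1$ as $n\to\infty$. Verification for $n = 2$ and $n = 3$ is an easy exercise for the reader.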
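The verification left to the reader in Ex. 8 amounts to evaluating a few gamma functions. The sketch below, with function names of our own, computes $V(s')/\sigma^2$ and $V(s_0')/\sigma^2$ from the expressions of the example and divides the lower limit $1/(2n)$ by each.

```python
import math

def variance_ratio_s_prime(n):
    """V(s')/sigma^2, where s' = sqrt(n/2) * Gamma((n-1)/2) / Gamma(n/2) * s."""
    g = math.gamma((n - 1) / 2.0) / math.gamma(n / 2.0)
    return (n - 1) / 2.0 * g * g - 1.0

def variance_ratio_s0_prime(n):
    """V(s0')/sigma^2, where s0' = sqrt(n/2) * Gamma(n/2) / Gamma((n+1)/2) * s0."""
    g = math.gamma(n / 2.0) / math.gamma((n + 1) / 2.0)
    return n / 2.0 * g * g - 1.0

def efficiency(variance_ratio, n):
    """Ratio of the lower limit sigma^2/(2n) to the actual variance."""
    return (1.0 / (2.0 * n)) / variance_ratio

for n in (2, 3, 10, 100):
    print(n,
          round(efficiency(variance_ratio_s_prime(n), n), 4),
          round(efficiency(variance_ratio_s0_prime(n), n), 4))
```

For every $n$ shown, $e(s_0')$ exceeds $e(s')$, and both approach 1 as $n$ grows; the known mean $\mu$ is thus used to advantage in $s_0'$.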

Ex. 9. If $t$ is the most efficient estimator and $t_1$ a less efficient estimator with efficiency $E$, and if the correlation of $t$ and $t_1$ is $\rho$, show, by considering the estimator $t_2$ defined by

\[ (1+E-2\rho\sqrt{E})\,t_2 = (1-\rho\sqrt{E})\,t + (E-\rho\sqrt{E})\,t_1, \]

that $\rho = \sqrt{E}$.

What conclusion can be drawn regarding the compounding of an efficient and an inefficient statistic? Show that $t$ and $t_1-t$ are uncorrelated. [Delhi 1964]

If $t$ and $t_1$ approach $\theta$ in the probability sense, so does $t_2$. By hypothesis,

\[ (1+E-2\rho\sqrt{E})\,t_2 = (1-\rho\sqrt{E})\,t + (E-\rho\sqrt{E})\,t_1. \]

Then

\[ V\{(1+E-2\rho\sqrt{E})\,t_2\} = V\{(1-\rho\sqrt{E})\,t + (E-\rho\sqrt{E})\,t_1\}, \]

i.e.

\[ (1+E-2\rho\sqrt{E})^2\,V(t_2) = (1-\rho\sqrt{E})^2V(t) + (E-\rho\sqrt{E})^2V(t_1) + 2(1-\rho\sqrt{E})(E-\rho\sqrt{E})\,\mathrm{cov}(t, t_1). \qquad \ldots(1) \]

Define $V(t) = V$. Then $E = V(t)/V(t_1)$, so that $V(t_1) = V/E$, and hence

\[ \rho = \frac{\mathrm{cov}(t, t_1)}{\sqrt{V(t)\,V(t_1)}} \]

yields: $\mathrm{cov}(t, t_1) = \rho V/\sqrt{E}$.

Then (1) becomes

\[ V(t_2) = \frac{V\{(1-\rho\sqrt{E})^2 + (\sqrt{E}-\rho)^2 + 2(1-\rho\sqrt{E})(\sqrt{E}-\rho)\rho\}}{(1+E-2\rho\sqrt{E})^2} = \frac{V(1-\rho^2)(1+E-2\rho\sqrt{E})}{(1+E-2\rho\sqrt{E})^2} \]

\[ = \frac{V(1-\rho^2)}{1+E-2\rho\sqrt{E}} = \frac{V(1-\rho^2)}{(1-\rho^2)+(\rho-\sqrt{E})^2} \ \le\ V. \qquad \ldots(2) \]

In fact, $V(t_2) \ge V$, which, in view of (2), claims that $V(t_2) = V$. Therefore we find

\[ 1-\rho^2 = (1-\rho^2) + (\rho-\sqrt{E})^2, \]

which yields: $\rho = \sqrt{E}$.

Let $t_1$ be an efficient estimate and $t_2$ any regular unbiased estimate of efficiency $E > 0$. Then compounding these statistics shows that the correlation between $t_1$ and $t_2$ is also $\sqrt{E}$. Reason: We construct a new regular unbiased estimate $\hat t$ such that

\[ \hat t = (1-\lambda)\,t_1 + \lambda\,t_2. \]

Then $\hat t$ has the variance

\[ V(\hat t) = \left[(1-\lambda)^2 + \frac{2\lambda(1-\lambda)\rho}{\sqrt{E}} + \frac{\lambda^2}{E}\right]V(t_1), \]

where $E = \dfrac{V(t_1)}{V(t_2)}$, which can be written as

\[ V(\hat t) = \left[1 + 2\lambda\,\frac{\rho-\sqrt{E}}{\sqrt{E}} + \lambda^2\left(1-\frac{2\rho}{\sqrt{E}}+\frac{1}{E}\right)\right]V(t_1). \]

Let $\rho \ne \sqrt{E}$. Then, giving $\lambda$ a sufficiently small positive or negative value, we see that the coefficient of $V(t_1)$ is less than 1. This implies that $V(\hat t) < V(t_1)$, which, in turn, implies that the efficiency of $\hat t$ would be greater than 1, which is impossible. Hence $\rho = \sqrt{E}$.

Returning to the efficient estimator $t$ and the estimator $t_1$ of efficiency $E$, evidently

\[ V(t_1) = V\{t + (t_1-t)\}, \]

which provides:

\[ V(t_1) = V(t) + V(t_1-t) + 2\,\mathrm{cov}(t, t_1-t) = V(t) + V(t_1) + V(t) - 2\,\mathrm{cov}(t, t_1) + 2\,\mathrm{cov}(t, t_1-t), \]

which simplifies to

\[ V(t) - \mathrm{cov}(t, t_1) + \mathrm{cov}(t, t_1-t) = 0. \qquad \ldots(3) \]

Now

\[ \rho = \frac{\mathrm{cov}(t, t_1)}{\sqrt{V(t)\,V(t_1)}} = \frac{\mathrm{cov}(t, t_1)}{\sqrt{V(t)\cdot V(t)/E}} = \frac{\mathrm{cov}(t, t_1)\,\sqrt{E}}{V(t)}, \]

which, in view of $\rho = \sqrt{E}$, claims that $\mathrm{cov}(t, t_1) = V(t)$, by dint of which (3) shows that

\[ \mathrm{cov}(t, t_1-t) = 0. \]

This asserts that $t$ and $t_1-t$ are uncorrelated.
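The algebra leading to (2) in Ex. 9 can be checked numerically. In the sketch below, which is our own, we take $V(t)=1$, so that $V(t_1)=1/E$ and $\mathrm{cov}(t,t_1)=\rho/\sqrt{E}$; the trial values of $E$ and $\rho$ are illustrative assumptions.

```python
import math

def variance_ratio_of_compound(E, rho):
    """V(t2)/V(t) computed term by term from the definition of t2."""
    s = math.sqrt(E)
    a = 1.0 - rho * s                 # coefficient of t before normalisation
    b = E - rho * s                   # coefficient of t1 before normalisation
    norm = 1.0 + E - 2.0 * rho * s
    # With V(t) = 1 we have V(t1) = 1/E and cov(t, t1) = rho / sqrt(E).
    variance = a * a + b * b / E + 2.0 * a * b * rho / s
    return variance / norm ** 2

def closed_form(E, rho):
    """Right-hand side of (2): (1 - rho^2) / (1 + E - 2 rho sqrt(E))."""
    return (1.0 - rho ** 2) / (1.0 + E - 2.0 * rho * math.sqrt(E))

E = 0.5
print(variance_ratio_of_compound(E, 0.3), closed_form(E, 0.3))
print(variance_ratio_of_compound(E, math.sqrt(E)))
```

With $\rho=0.3\neq\sqrt{E}$ the ratio falls below 1, which is impossible for a most efficient $t$; with $\rho=\sqrt{E}$ the ratio is exactly 1.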

Ex. 10. Two efficient estimators for a certain parameter must be perfectly correlated.

For, assume that $t_1$ and $t_2$ are two efficient estimators for a parameter $\theta$ with least variance $V$.

Then we construct a new regular unbiased estimator $t$ such that

\[ t = \tfrac{1}{2}(t_1+t_2), \]

so that

\[ V(t) = \tfrac{1}{4}\left[V(t_1)+V(t_2)+2\,\mathrm{cov}(t_1, t_2)\right] \ \ge\ V. \]

Then $\tfrac{1}{4}[V+V+2\rho V] \ge V$, where $\rho$ is the coefficient of correlation between $t_1$ and $t_2$.

This yields:

\[ 1+\rho \ge 2, \quad\text{i.e.}\quad \rho \ge 1. \]

Remember that $\rho$ cannot exceed 1, and hence $\rho = 1$. This proves the proposition.

Ex. 11. Cite an example of estimators which are (a) unbiased and efficient, (b) unbiased and inefficient, and (c) biased and inefficient.

(a) The sample mean $\bar x$ and the modified sample variance $\hat s'^2 = \dfrac{n}{n-1}\,s^2$ are two such examples, the latter in the large sample sense, since its efficiency $(n-1)/n$ tends to 1. (Vide Ex. 6 and Ex. 7.)

(b) The sample median and the sample statistic $\tfrac{1}{2}(Q_1+Q_3)$, where $Q_1$ and $Q_3$ are the lower and the upper sample quartiles, are unbiased estimates of the population mean because the mean of their sampling distribution is the population mean. Therefore they provide the examples of the desired estimates which are unbiased and inefficient. According to Ex. 1, the efficiency of the median as an estimate of the population mean is $\dfrac{2}{\pi} < 1$.

(c) The sample standard deviation $s$, the modified standard deviation, the mean deviation and the semi-interquartile range are the requested examples of biased and inefficient estimates (Vide Ex. 7).

15.6 Notion of Sufficiency: The criterion of sufficiency, which is due to Fisher (1921a, 1925), supersedes the criteria of consistency, unbiasedness, minimum variance and efficiency.

First of all we shall consider the estimation of a single parameter $\theta$. There exists an unlimited number of possible estimators of $\theta$, from among which we must select. Let us take a sample of $n \ge 2$ observations and then consider the joint distribution of a set of $r$ functionally independent estimators

\[ f_r(t, t_1, t_2, \ldots, t_{r-1} \mid \theta), \qquad r = 2, 3, \ldots, n, \]

where the estimator $t$ has been chosen for special consideration.

We recall the multiplication theorem that the probability of two propositions $q$ and $r$ on data $p$ is the product of the probability of $q$ given $p$ and that of $r$ given $q$ and $p$; that is, in symbols,

\[ P(qr \mid p) = P(q \mid p)\,P(r \mid qp). \qquad \ldots(1) \]

Writing $q = t$, $r = t_1, \ldots, t_{r-1}$, and $p = \theta$ in (1) leads to

\[ f_r(t, t_1, \ldots, t_{r-1} \mid \theta) = g(t \mid \theta)\,h_{r-1}(t_1, \ldots, t_{r-1} \mid t, \theta). \qquad \ldots(2) \]

Note that $g(t \mid \theta)$ is the marginal distribution of $t$, and $h_{r-1}(t_1, \ldots, t_{r-1} \mid t, \theta)$ the conditional distribution of the other estimators given $t$.

Now we assume that the last factor $h_{r-1}$ is independent of $\theta$. Then we encounter a situation in which, given $t$, the set $t_1, \ldots, t_{r-1}$ contributes nothing further to our knowledge of $\theta$. Assume further that this holds true for each $r$ and any set of $(r-1)$ estimators. Then we are justified in saying that $t$ contains all the information in the sample about $\theta$, and we therefore call it a sufficient statistic for $\theta$. Now we have the following.

Definition. An estimator $t$ is defined as sufficient for the parameter $\theta$ iff

\[ f_r(t, t_1, \ldots, t_{r-1} \mid \theta) = g(t \mid \theta)\,h_{r-1}(t_1, \ldots, t_{r-1} \mid t), \qquad \ldots(3) \]

where $h_{r-1}$ is independent of $\theta$, holds for $r = 2, 3, \ldots, n$ and any choice of $t_1, \ldots, t_{r-1}$.

This definition is usually given only for $r = 2$, but the definition for each $r$ seems to us more natural. It adds no further restriction to the concept of sufficiency.

An alternative definition of a sufficient statistic is as follows:

A statistic $t$ is sufficient for the parameter $\theta$ if, with regard to any other statistic $t_1$, the joint probability density is

\[ P\{t, t_1\} = P_1\{t, \theta\}\,P_2\{t_1 \mid t\}, \]

where $P_1\{t, \theta\}$ is the probability density for $t$, and $P_2\{t_1 \mid t\}$, the relative probability density of $t_1$ given $t$, is independent of $\theta$.

In terms of the concept of the likelihood function, (3) may be written

\[ L(x_1, x_2, \ldots, x_n \mid \theta) = g(t \mid \theta)\,k(x_1, \ldots, x_n), \qquad \ldots(4) \]

where $g(t \mid \theta)$ is the marginal distribution of $t$ and $k$ is independent of $\theta$. Hence $t$ is sufficient for $\theta$ iff (4) holds.

Theorem 10. The relation

\[ f_r(t, t_1, \ldots, t_{r-1} \mid \theta) = g(t \mid \theta)\,h_{r-1}(t_1, \ldots, t_{r-1} \mid t) \]

for the joint distribution of a set of $r$ functionally independent estimators is a necessary and sufficient condition for

\[ L(x_1, x_2, \ldots, x_n \mid \theta) = g(t \mid \theta)\,k(x_1, \ldots, x_n). \]

Deduction. A sufficient statistic provides the MVB estimator where there is one.

Proof. The necessary and sufficient condition for sufficiency is

\[ L(x_1, \ldots, x_n \mid \theta) = g(t \mid \theta)\,k(x_1, \ldots, x_n), \]

which, on taking logarithms of both sides, yields:

\[ \log L = \log g(t \mid \theta) + \log k(x_1, \ldots, x_n). \]

Differentiating this with regard to $\theta$ leads to

\[ \frac{\partial\log L}{\partial\theta} = \frac{\partial\log g(t \mid \theta)}{\partial\theta}. \qquad \ldots(4) \]

In view of (16), § 15.4, the condition that an MVB estimator of $\psi(\theta)$ exists is

\[ \frac{\partial\log L}{\partial\theta} = A(\theta)\{t-\psi(\theta)\}. \qquad \ldots(5) \]

Then we have

\[ \frac{\partial\log g(t \mid \theta)}{\partial\theta} = A(\theta)\{t-\psi(\theta)\}. \qquad \ldots(6) \]

This proves the deduction.

Note that (5) is a special case of (4) in view of (6). Sufficiency, which at first sight seems a more restrictive criterion than the attainment of the MVB, is in fact a less restrictive one. Reason: if (5) holds then (4) holds, but even if (5) does not hold we may still have a sufficient statistic.

Theorem 11. The sufficient statistic $t$ is unique, except that if $t$ is sufficient, any one-to-one function of $t$ will also be sufficient.

Proof. Recall that a statistic is sufficient for $\theta$ iff

\[ L(x_1, \ldots, x_n \mid \theta) = g(t \mid \theta)\,k(x_1, \ldots, x_n). \]

Setting $t = t(u)$ implies that

\[ L(x_1, \ldots, x_n \mid \theta) = g(t \mid \theta)\,|t'(u)| \cdot \frac{k(x)}{|t'(u)|} = g_1(u, \theta)\,k_1(x), \]

where $g_1(u, \theta)$ is the frequency function of $u$, and $k_1$ is independent of $\theta$. Hence $u$ is also sufficient for $\theta$.

To show that the sufficient statistic is unique, assume that $t_1$ and $t_2$ are two distinct sufficient statistics for $\theta$. Then, with $r = 2$,

\[ f_r(t, t_1, \ldots, t_{r-1} \mid \theta) = g(t \mid \theta)\,h_{r-1}(t_1, \ldots, t_{r-1} \mid t) \]

yields:

\[ f_2(t_1, t_2 \mid \theta) = g_1(t_1 \mid \theta)\,h_1(t_2 \mid t_1) = g_2(t_2 \mid \theta)\,h_2(t_1 \mid t_2), \]

which implies the functional relationship:

\[ t_1 = k(t_2, \theta). \qquad \ldots(1) \]

Note that $t_1$ and $t_2$ are functions of the observations only, and not of $\theta$. Hence from (1), $t_1$ is functionally related to $t_2$. This establishes the uniqueness of the sufficient statistic.

Counter Example. We cite the following example in support of the statement that functions of $t$ which are not one-to-one may also be sufficient:

'For a normal distribution with mean and variance both equal to $\theta$, for a single observation, $x$ and $x^2$ are both sufficient for $\theta$, $x^2$ being minimal.'

Theorem 12 (due to C. R. Rao and Blackwell). Irrespective of any variance bound, the minimum variance unbiased estimator of $\psi(\theta)$ is always a function of the sufficient statistic, if one exists.

Proof. Assume $t$ is sufficient for $\theta$, and $t_1$ is another statistic with

\[ E(t_1) = \psi(\theta). \qquad \ldots(i) \]

Recall that the necessary and sufficient condition for sufficiency is

\[ L(x_1, \ldots, x_n \mid \theta) = g(t \mid \theta)\,k(x_1, \ldots, x_n). \]

Then we have from (i),

\[ \psi(\theta) = \int\cdots\int t_1\,L\,dx_1\ldots dx_n = \int\cdots\int t_1\,g(t \mid \theta)\,k(x_1, \ldots, x_n)\,dx_1\ldots dx_n. \qquad \ldots(ii) \]

Transforming the right-hand side of (ii) to a new set of variables $t, x_2, \ldots, x_n$ and integrating out the last $(n-1)$ of these, we obtain

\[ \psi(\theta) = \int p(t)\,g(t \mid \theta)\,dt. \qquad \ldots(iii) \]

Then (iii) claims that there exists a function $p(t)$ of the sufficient statistic which is unbiased for $\psi(\theta)$.

Evidently, $p(t) = E(t_1 \mid t)$.

Now

\[ V(t_1) = E\{t_1-\psi(\theta)\}^2 = E\left\{[t_1-p(t)] + [p(t)-\psi(\theta)]\right\}^2 = E\{t_1-p(t)\}^2 + E\{p(t)-\psi(\theta)\}^2, \]

because

\[ E\left\{[t_1-p(t)]\,[p(t)-\psi(\theta)]\right\} = 0 \]

on taking the conditional expectation given $t$. Hence

\[ V(t_1) = E\{t_1-p(t)\}^2 + V\{p(t)\} \ \ge\ V\{p(t)\}. \]

This completes the proof of the proposition.
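The construction $p(t)=E(t_1\mid t)$ in the proof of Theorem 12 can be followed exactly in a small discrete case. The sketch below is an illustration of our own: it assumes $n$ Bernoulli observations with parameter $\theta$, takes $t=\sum x_j$ as the sufficient statistic and the crude unbiased estimator $t_1=x_1$, and enumerates all $2^n$ samples.

```python
from itertools import product

def rao_blackwell_bernoulli(n, theta):
    """Return p(t) = E(x_1 | sum = t), V(t_1) and V(p(t)) by full enumeration."""
    outcomes = []
    for xs in product((0, 1), repeat=n):
        prob = 1.0
        for x in xs:
            prob *= theta if x == 1 else 1.0 - theta
        outcomes.append((xs, prob))

    # Conditional expectation of t_1 = x_1 given the sufficient statistic t.
    numerator = [0.0] * (n + 1)
    denominator = [0.0] * (n + 1)
    for xs, prob in outcomes:
        t = sum(xs)
        numerator[t] += xs[0] * prob
        denominator[t] += prob
    p = [numerator[t] / denominator[t] for t in range(n + 1)]

    var_t1 = sum(prob * (xs[0] - theta) ** 2 for xs, prob in outcomes)
    var_p = sum(prob * (p[sum(xs)] - theta) ** 2 for xs, prob in outcomes)
    return p, var_t1, var_p

p, var_t1, var_p = rao_blackwell_bernoulli(4, 0.3)
print(p)                 # equals t/n for t = 0, 1, ..., n
print(var_t1, var_p)     # theta(1-theta) against theta(1-theta)/n
```

Here $p(t)=t/n$ does not involve $\theta$, as sufficiency requires, and its variance is smaller than that of $t_1$ by the factor $n$.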

Deduction. There exists a unique function $p(t)$ which is an MV unbiased estimator of $\psi(\theta)$.

Proof. Theorem 8 says that the MV estimator of $\psi(\theta)$ is unique. Then the deduction is immediate, provided that there is a sufficient statistic $t$ for $\theta$ and an unbiased estimator of $\psi(\theta)$ exists.
Illustrative Examples 

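Ex. 1. What is the most general form of distribution, differentiable in $\theta$, for which the sample mean is the solution of the likelihood equation $\dfrac{\partial}{\partial\theta}\log L = 0$?

By hypothesis, a solution of

\[ \frac{\partial\log L}{\partial\theta} = \sum\frac{\partial\log f(x \mid \theta)}{\partial\theta} = 0 \]

is

\[ \theta = \frac{1}{n}\sum x, \]

i.e. $n\theta = \sum x$, which implies

\[ \sum(x-\theta) = 0. \]

Then we claim that

\[ \frac{\partial\log f}{\partial\theta} = A\,(x-\theta). \]

This holds for all $x$ and $\theta$, where $A$ is independent of $x$ but may depend on $\theta$; let $A$ be equal to $\dfrac{\partial^2\psi}{\partial\theta^2}$. Then integrating with regard to $\theta$ leads to

\[ \log f = \int(x-\theta)\,\frac{\partial^2\psi}{\partial\theta^2}\,d\theta = (x-\theta)\frac{\partial\psi}{\partial\theta} + \int\frac{\partial\psi}{\partial\theta}\,d\theta + \text{const.} = (x-\theta)\frac{\partial\psi}{\partial\theta} + \psi(\theta) + \xi(x), \]

where $\xi(x)$ is an arbitrary function of $x$. Therefore

\[ f(x) = k\exp\left\{(x-\theta)\frac{\partial\psi}{\partial\theta} + \psi(\theta) + \xi(x)\right\}, \qquad \ldots(1) \]

which is the most general form of $f(x)$. On letting

\[ \psi(\theta) = \tfrac{1}{2}\theta^2, \qquad \xi(x) = -\tfrac{1}{2}x^2, \]

(1) assumes the form

\[ f(x) = k\exp\{-\tfrac{1}{2}(x-\theta)^2\}, \]

which is the normal distribution.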

Ex. 2. The largest observation $x_{(n)}$ in a sample of $n$ independent observations drawn from the population defined by

$dF(x) = dx/\theta$, $0 \le x \le \theta$

is sufficient for $\theta$, but it is not an unbiased estimator of $\theta$.

Evidently the likelihood function is

$L(x \mid \theta) = \theta^{-n}$ ...(1)

This does not explicitly contain the observations. But we can determine a sufficient statistic for $\theta$. Recalling the distribution of the $r$th order statistic $x_{(r)}$, we have

$dG_r(x_{(r)}) = \frac{\{F(x_{(r)})\}^{r-1}\{1 - F(x_{(r)})\}^{n-r}}{B(r,\, n-r+1)}\, dF(x_{(r)})$.

Letting $r = n$ yields the following distribution of the largest observation $x_{(n)}$ in a sample of $n$ independent observations:

$dG_n(x_{(n)}) = n\{F(x_{(n)})\}^{n-1}\, dF(x_{(n)})$ ...(2)

Our hypothesis furnishes:

$dF(x_{(n)}) = \frac{dx_{(n)}}{\theta}$,

which implies that

$F(x_{(n)}) = \frac{x_{(n)}}{\theta}$.

Hence (2) assumes the form

$dG_n(x_{(n)}) = n\left(\frac{x_{(n)}}{\theta}\right)^{n-1} \cdot \frac{1}{\theta}\, dx_{(n)}$,

which, after slight simplification, becomes

$dG_n(x_{(n)}) = n\, x_{(n)}^{\,n-1}\, dx_{(n)}/\theta^n = g(x_{(n)} \mid \theta)\, dx_{(n)}$,

by dint of which (1) may be written as

$L(x \mid \theta) = g(x_{(n)} \mid \theta) \cdot \frac{1}{n\, x_{(n)}^{\,n-1}}$.

This satisfies the condition for sufficiency:

$L(x_1, x_2, \ldots, x_n \mid \theta) = g(t \mid \theta)\, k(x_1, \ldots, x_n)$,

and hence $x_{(n)}$ is sufficient for $\theta$. But $x_{(n)}$ is not an unbiased estimator of $\theta$, since $E(x_{(n)}) = n\theta/(n+1)$.

Remark. When the range of $f(x \mid \theta)$ depends on $\theta$, one must be careful in verifying that $g(t \mid \theta)$ is a frequency function. Trivially,

$L(x \mid \theta) = \frac{1}{\theta^n\, p(x_1, \ldots, x_n)} \cdot p(x_1, \ldots, x_n)$

for any function $p$. But only $g(x_{(n)} \mid \theta)$ will provide the sufficient statistic.

Ex. 3. For the rectangular distribution

$dF(x) = dx/(2\theta)$, $-\theta \le x \le \theta$,

the upper terminal of the range is a monotone decreasing function of the lower terminal $(-\theta)$. Hence the sufficient statistic for $(-\theta)$ is

$t = \min(x_{(1)},\, -x_{(n)})$;

that is, for $\theta$ itself,

$t' = -t = \max(|x_{(1)}|,\, |x_{(n)}|)$

is the desired sufficient statistic.

Ex. 4. The smallest observation, $x_{(1)}$, drawn from the distribution

$dF(x) = \exp\{-(x - \alpha)\}\, dx$, $\alpha \le x < \infty$

is sufficient for the lower terminal $\alpha$.

For, as a matter of fact,

$f(x) = \exp\{-(x - \alpha)\}$

can be written as

$f(x) = \exp(-x)/\exp(-\alpha)$,

which is of the form

$f(x \mid \alpha) = g(x)/h(\alpha)$.

Hence our assertion follows immediately.

Ex. 5. There exists no single sufficient statistic for $\theta$ when
$n \ge 2$ in the distribution

dF (x)oc exp (— x&) dx, 

For, the frequency function $f(x \mid \theta)$ cannot be put in the form

$f(x \mid \theta) = g(x)/h(\theta)$.

Ex. 6. In the two-parameter distribution

$dF(x) = dx/(\beta - \alpha)$, $\alpha \le x \le \beta$,

given $\beta$, $x_{(1)}$ is sufficient for $\alpha$; and given $\alpha$, $x_{(n)}$ is sufficient for $\beta$. Here $x_{(1)}$ is the smallest observation and $x_{(n)}$ the largest one.

Argument. Evidently $x_{(1)}$ and $x_{(n)}$ are a set of jointly sufficient statistics for $\alpha$ and $\beta$. Recalling the joint distribution* of $x_{(r)}$ and $x_{(s)}$, $r < s$, we have

$dG_{r,s} = \frac{\{F(x_{(r)})\}^{r-1}\{F(x_{(s)}) - F(x_{(r)})\}^{s-r-1}\{1 - F(x_{(s)})\}^{n-s}\, dF(x_{(r)})\, dF(x_{(s)})}{B(r,\, s-r)\, B(s,\, n-s+1)}$ ...(1)

Evidently $F(x_{(r)}) = \frac{x_{(r)} - \alpha}{\beta - \alpha}$, $F(x_{(s)}) = \frac{x_{(s)} - \alpha}{\beta - \alpha}$.

Letting $r = 1$, $s = n$ in (1) yields:

$dG_{1,n} = \frac{1}{B(1,\, n-1)\, B(n,\, 1)} \left\{\frac{x_{(n)} - x_{(1)}}{\beta - \alpha}\right\}^{n-2} dF(x_{(1)})\, dF(x_{(n)})$,

from which we obtain

$g(x_{(1)}, x_{(n)}) = n(n-1)\, (x_{(n)} - x_{(1)})^{n-2}/(\beta - \alpha)^n$.

Hence $L(x \mid \alpha, \beta) = (\beta - \alpha)^{-n} = g(x_{(1)}, x_{(n)})\, k(x)$.

This proves our assertion.

* See (14.2), § 14.2, M. G. Kendall and A. Stuart, The Advanced Theory of Statistics, Vol. I, Charles Griffin and Company Limited, London.

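Ex. 7. In random sampling from a normal population $N(\mu, \sigma)$,

(a) $\bar{x}$ is sufficient for $\mu$ when $\sigma^2$ is known;

(b) $\frac{1}{n}\sum_j (x_j - \mu)^2$ is sufficient for $\sigma^2$ when $\mu$ is known;

(c) $s^2 = \frac{1}{n}\sum_j (x_j - \bar{x})^2$ is not sufficient for $\sigma^2$ alone;

(d) $\left\{\frac{1}{n}\sum_j (x_j - \mu)^2\right\}^{1/2}$ is sufficient for $\sigma$ when $\mu$ is known;

(e) $\bar{x}$ and $s^2$ are jointly sufficient for $\mu$ and $\sigma^2$.

For, $N(\mu, \sigma)$ provides:

$f(x \mid \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2}(x - \mu)^2\right\}$.

Then

$L(x_1, \ldots, x_n \mid \mu, \sigma^2) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n \exp\left\{-\frac{1}{2\sigma^2}\sum_j (x_j - \mu)^2\right\}$ ...(1)

$= \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n \exp\left\{-\frac{1}{2\sigma^2}\left[\sum_j (x_j - \bar{x})^2 + n(\bar{x} - \mu)^2\right]\right\}$

$= \exp\left\{-\frac{n}{2\sigma^2}(\bar{x} - \mu)^2\right\} \cdot \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n \exp\left\{-\frac{1}{2\sigma^2}\sum_j (x_j - \bar{x})^2\right\}$ ...(2)

Claim. The likelihood function $L$ is factored into parts such that one contains $\bar{x}$ and $\mu$ and the other does not contain $\mu$. This satisfies the condition for sufficiency:

$L(x_1, \ldots, x_n \mid \mu) = g(\bar{x} \mid \mu)\, k(x_1, \ldots, x_n)$.

In conclusion, $\bar{x}$ is sufficient for $\mu$ when $\sigma^2$ is known. This proves (a).

From (1), we find

$L(x_1, \ldots, x_n \mid \mu, \sigma^2) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left\{-\frac{1}{2\sigma^2}\sum_{j=1}^{n} (x_j - \mu)^2\right\} \times k(x_1, x_2, \ldots, x_n)$,

where $k(x_1, \ldots, x_n) = 1$. This clearly satisfies the condition for sufficiency:

$L(x_1, \ldots, x_n \mid \sigma^2) = g(t \mid \sigma^2)\, k(x_1, \ldots, x_n)$,

where $t = \frac{1}{n}\sum_{j=1}^{n} (x_j - \mu)^2$, and $\mu$ is known. This proves our assertion that $\frac{1}{n}\sum_{j=1}^{n} (x_j - \mu)^2$ is sufficient for $\sigma^2$ when $\mu$ is known. This completes the proof of (b).

(2) indicates that the first factor on the right contains $\sigma^2$ and the second one contains both $s^2$ and $\sigma^2$.

Hence the condition for sufficiency:

$L(x_1, \ldots, x_n \mid \sigma^2) = g(t \mid \sigma^2)\, k(x_1, \ldots, x_n)$

does not hold good. This asserts that $t = s^2 = \frac{1}{n}\sum_j (x_j - \bar{x})^2$ is not sufficient for $\sigma^2$ alone. This proves (c).

Again (1) provides:

$L(x_1, \ldots, x_n \mid \sigma^2) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n \exp\left\{-\frac{n}{2\sigma^2}\left[\left\{\frac{1}{n}\sum_j (x_j - \mu)^2\right\}^{1/2}\right]^2\right\} \times k(x_1, \ldots, x_n)$,

where $k(x_1, \ldots, x_n) = 1$. We observe that the condition for sufficiency:

$L(x_1, \ldots, x_n \mid \sigma \text{ for known } \mu) = g(t \mid \sigma)\, k(x_1, \ldots, x_n)$,

where $t = \left\{\frac{1}{n}\sum_j (x_j - \mu)^2\right\}^{1/2}$, holds good. This proves (d).

To prove (e), we recall the joint distribution of $\bar{x}$ and $s^2$ in normal samples:

$g(\bar{x}, s^2 \mid \mu, \sigma^2) \propto \frac{1}{\sigma} \exp\left\{-\frac{n}{2\sigma^2}(\bar{x} - \mu)^2\right\} \cdot \frac{1}{\sigma^{n-1}}\, s^{n-3} \exp\left\{-\frac{n s^2}{2\sigma^2}\right\}$,

where $\sum_j (x_j - \mu)^2 = n s^2 + n(\bar{x} - \mu)^2$. Then we claim that

$L(x_1, \ldots, x_n \mid \mu, \sigma^2) = g(\bar{x}, s^2 \mid \mu, \sigma^2)\, k(x_1, \ldots, x_n)$,

and hence $\bar{x}$ and $s^2$ are jointly sufficient for $\mu$ and $\sigma^2$. This proves (e).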

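Ex. 8. In estimating the parameter $\lambda$ in the Poisson distribution defined by

$f(x \mid \lambda) = e^{-\lambda}\, \frac{\lambda^x}{x!}$,

$\bar{x}$ is a sufficient estimator of $\lambda$.

For, evidently

$L(x_1, \ldots, x_n \mid \lambda) = \frac{e^{-n\lambda}\, \lambda^{\sum_j x_j}}{\prod_{j=1}^{n} x_j!}$,

which may be put in the form

$L(x_1, \ldots, x_n \mid \lambda) = g(t \mid \lambda)\, k(x_1, \ldots, x_n)$

with $t = \bar{x}$, on taking $g(t \mid \lambda) = e^{-n\lambda}\lambda^{nt}$ and $k(x_1, \ldots, x_n) = 1/\prod_j x_j!$. The condition for sufficiency that $t = \bar{x}$ be a sufficient estimator for $\lambda$ therefore holds. This proves our assertion.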

Ex. 9. The statistic $t = \bar{x}$ determined from a random sample is a sufficient estimator of the parameter $\mu$ of the normal distribution $N(\mu, \sigma)$ with known standard deviation, and this estimate is also the most efficient.

In view of Ex. 7, $t = \bar{x}$ is a sufficient estimator of $\mu$ when $\sigma$ is known. From (2), Ex. 7 gives:

$g(\bar{x} \mid \mu) = \exp\left\{-\frac{n}{2\sigma^2}(\bar{x} - \mu)^2\right\}$,

from which we obtain

$\frac{\partial \log g(\bar{x} \mid \mu)}{\partial\mu} = \frac{\bar{x} - \mu}{\sigma^2/n}$.

Hence the mean of the sample drawn from the normal population is the most efficient estimator of $\mu$, with variance $\sigma^2/n$.

Therefore $\mathrm{Var}(t) = \sigma^2/n$ is the minimum variance of a regular and unbiased estimator of the parameter $\mu$ from the random sample of size $n$ drawn from a normal population.

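Ex. 10. Proposition. A sufficient estimator is most efficient, provided that a most efficient estimator exists.

Proof. Assume that $t_1$ is a sufficient estimator of $\theta$ and $t_2$ is any other estimator, and further assume that the joint distribution of $t_1$ and $t_2$ tends to normality for large $n$, say in the form

$dF \propto \exp\left[-\frac{1}{2(1-\rho^2)}\left\{\frac{(t_1-\theta)^2}{v_1} - \frac{2\rho(t_1-\theta)(t_2-\theta)}{\sqrt{v_1 v_2}} + \frac{(t_2-\theta)^2}{v_2}\right\}\right] dt_1\, dt_2$ ...(1)

where $v_1 = V(t_1)$, $v_2 = V(t_2)$. The sufficiency of $t_1$ implies that the distribution of $t_2 \mid t_1$ does not contain $\theta$. Now the distribution of $t_1$ is

$dF \propto \exp\left\{-\frac{1}{2}\,\frac{(t_1-\theta)^2}{v_1}\right\} dt_1$ ...(2)

and hence the conditional distribution of $t_2 \mid t_1$ is

$dF \propto \exp\left[-\frac{1}{2(1-\rho^2)}\left\{\frac{(t_1-\theta)^2}{v_1} - \frac{2\rho(t_1-\theta)(t_2-\theta)}{\sqrt{v_1 v_2}} + \frac{(t_2-\theta)^2}{v_2}\right\} + \frac{1}{2}\,\frac{(t_1-\theta)^2}{v_1}\right] dt_2$

$= \exp\left[-\frac{1}{2(1-\rho^2)}\left\{\frac{\rho^2(t_1-\theta)^2}{v_1} - \frac{2\rho(t_1-\theta)(t_2-\theta)}{\sqrt{v_1 v_2}} + \frac{(t_2-\theta)^2}{v_2}\right\}\right] dt_2$

$= \exp\left[-\frac{1}{2(1-\rho^2)}\left\{\frac{\rho(t_1-\theta)}{\sqrt{v_1}} - \frac{(t_2-\theta)}{\sqrt{v_2}}\right\}^2\right] dt_2$ ...(3)

This factor must not contain $\theta$. Hence we must have

$\rho/\sqrt{v_1} = 1/\sqrt{v_2}$, i.e. $\rho = \sqrt{v_1/v_2} = \sqrt{E}$,

where $E$ is the efficiency of $t_2$.

The fact that $\rho \le 1$ implies that $v_1 \le v_2$, which in turn implies that $t_1$ has a variance no larger than that of any other estimator. This proves our assertion.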


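Ex. 11. Assume that $x_1$ and $x_2$ form a random sample from a normal distribution $N(\mu, 1)$. Then $x_1 + x_2$ is the sufficient statistic for $\mu$.

For, evidently,

$f(x \mid \mu) = \frac{1}{\sqrt{2\pi}} \exp\{-\tfrac{1}{2}(x - \mu)^2\}$.

Let $\alpha_1 = x_1 + x_2$. Applying the transformation

$x_1 = \alpha_1 - \alpha_2$,
$x_2 = \alpha_2$, ...(1)

we have the Jacobian

$\frac{\partial(x_1, x_2)}{\partial(\alpha_1, \alpha_2)} = \begin{vmatrix} 1 & -1 \\ 0 & 1 \end{vmatrix} = 1$,

which implies that $dx_1\, dx_2 = d\alpha_1\, d\alpha_2$.

Now $L(x_1, x_2 \mid \mu)$

$= \frac{1}{2\pi} \exp\left[-\tfrac{1}{2}\{(x_1 - \mu)^2 + (x_2 - \mu)^2\}\right]$,

which, in view of (1), is transformed to

$L(\alpha_1, \alpha_2 \mid \mu)$

$= \frac{1}{2\pi} \exp\left[-\tfrac{1}{2}\{(\alpha_1 - \alpha_2 - \mu)^2 + (\alpha_2 - \mu)^2\}\right]$

$= \frac{1}{2\pi} \exp\left[-\tfrac{1}{2}\{(\alpha_1 - \mu)^2 - 2\alpha_2(\alpha_1 - \mu) + \alpha_2^2 + (\alpha_2 - \mu)^2\}\right]$

$= \frac{1}{2\pi} \exp\left[-\tfrac{1}{2}\{(\alpha_1 - \mu)^2 + \mu^2\}\right] \cdot \exp\{-(\alpha_2^2 - \alpha_1\alpha_2)\}$

$= g(\alpha_1 \mid \mu)\, k(\alpha_1, \alpha_2)$.

Hence the condition for sufficiency proves our assertion.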

15.7. Method of Estimation.

The requisites of a good estimator have been discussed in detail in the previous sections. Now we wish to outline in brief some of the important methods to obtain such estimators. The methods commonly used are as follows:

(a) Method of Maximum Likelihood.

(b) Method of Least Squares.

(c) Method of Minimum Chi-Square.

(d) Method of Minimum Variance.

(e) Method of Moments.

(f) Method of Inverse Probability.

In the following sections, we shall discuss briefly only the first method and the last but one.

Maximum Likelihood. 

We shall confine ourselves for the most part to the case of samples of $n$ independent observations from the same distribution. Recall that the joint probability of the observations, regarded as a function of a single unknown parameter $\theta$, is called the Likelihood Function (abbreviated LF) of the sample, and is written

$L(x \mid \theta) = f(x_1 \mid \theta)\, f(x_2 \mid \theta) \cdots f(x_n \mid \theta)$, ...(1)

where $f(x \mid \theta)$ is written indifferently for a univariate or multivariate, continuous or discrete distribution.

The Maximum Likelihood Principle. The ML principle is simply as follows:

If $L(x \mid \theta) = f(x_1 \mid \theta) \cdots f(x_n \mid \theta)$ is the frequency function for a random sample of size $n$ drawn from a population with an unknown parameter $\theta$, then the maximum likelihood estimator of $\theta$ is the number $\hat\theta$ within the admissible range of $\theta$ which makes the LF as large as possible. That is, we choose $\hat\theta$ such that for any admissible value $\theta$

$L(x \mid \hat\theta) \ge L(x \mid \theta)$ ...(2)

If the range of $f(x \mid \theta)$ is independent of $\theta$ (or if $f(x \mid \theta)$ is zero at its terminals for each $\theta$), and $\theta$ may assume any real value in an interval (which may be infinite in either or both directions), stationary values of the LF within the interval, if they exist, are given by the roots of

$L'(x \mid \theta) = \frac{\partial L(x \mid \theta)}{\partial\theta} = 0$ ...(3)

A sufficient (but not necessary) condition that any of these stationary values (say $\hat\theta$) be a local maximum is that

$L''(x \mid \hat\theta) < 0$ ...(4)

Finding all the local maxima of the LF in this way (and, if there are more than one, choosing the largest of them) yields the solution of (2), provided that there exists no terminal maximum of the LF at the extreme possible values of $\theta$ and that the LF is a twice-differentiable function of $\theta$ throughout its range.

To work with the logarithm of the LF is generally simpler than to work with the function itself. Under the conditions of the last paragraph, they will have maxima together because

$\frac{\partial}{\partial\theta} \log L = L'/L = \frac{1}{L}\frac{\partial L}{\partial\theta}$

and $L > 0$. Hence we wish to find the solutions of

$(\log L)' = \frac{\partial}{\partial\theta} \log L = 0$, ...(5)

for which

$(\log L)'' = \frac{\partial^2}{\partial\theta^2} (\log L) < 0$ ...(6)

Note that (5) and (6) are easier to solve than (3) and (4).

We call (5) the likelihood equation.
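
The likelihood equation and the second-derivative check can be carried out numerically when no closed form is at hand. The sketch below is an illustration under assumptions not made in the text: a Poisson sample (for which the score is $-n + \sum x_j/\lambda$), hypothetical data values, and a simple bisection search with hypothetical bracketing limits.

```python
def score(lam, xs):
    """First derivative of log L with respect to lambda for a Poisson sample."""
    return -len(xs) + sum(xs) / lam

def second_derivative(lam, xs):
    """Second derivative of log L with respect to lambda for a Poisson sample."""
    return -sum(xs) / lam**2

def solve_likelihood_equation(xs, lo=1e-6, hi=1e3, tol=1e-10):
    """Root of the score by bisection; the score decreases as lambda grows."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if score(mid, xs) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

sample = [2, 0, 3, 1, 4, 2, 1, 3]      # hypothetical observations
lam_hat = solve_likelihood_equation(sample)
print(lam_hat, second_derivative(lam_hat, sample))
```

The root agrees with the sample mean of the data, and the negative second derivative confirms that the stationary value is a maximum.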

Theorem 13. If a single sufficient statistic exists for $\theta$, the ML estimator of $\theta$ must be a function of it.

Proof. Assume that $t$ is the sufficient statistic for $\theta$. Then sufficiency of $t$ for $\theta$ implies the factorization of the LF

$L(x \mid \theta) = \prod_j f(x_j \mid \theta)$.

That is, $L(x \mid \theta) = g(t \mid \theta)\, k(x)$, ...(7)

where the second factor $k(x)$ on the right of (7) is independent of $\theta$. Hence choice of $\hat\theta$ to maximize $L(x \mid \theta)$ is equivalent to picking $\hat\theta$ to maximize $g(t \mid \theta)$, and consequently $\hat\theta$ will be a function of $t$ alone. This completes the proof.

Theorem 15. The likelihood equation

$\frac{\partial}{\partial\theta} \log L = (\log L)' = 0$ ...(8)

always has a unique solution, and it is the maximum of the LF.

Proof. The condition of sufficiency of a single statistic $t$ for $\theta$ implies that

$L(x \mid \theta) = g(t \mid \theta)\, k(x)$.

Then

$\frac{\partial}{\partial\theta} \log L(x \mid \theta) = \frac{\partial}{\partial\theta} \log g(t \mid \theta)$,

and hence the LF is of the form in which the MVB estimation of some function of $\theta$ is possible. Therefore, the LF is of the form

$\frac{\partial}{\partial\theta} (\log L) = A(\theta)\, \{t - \psi(\theta)\}$ ...(9)

where $E(t) = \psi(\theta)$. Then the solutions of (8) are of the form

$t = \psi(\hat\theta)$ ...(10)

Differentiation of (9) with regard to $\theta$ gives

$\frac{\partial^2}{\partial\theta^2} (\log L) = A'(\theta)\, \{t - \psi(\theta)\} - A(\theta)\, \psi'(\theta)$ ...(11)

But

$V(t) = \frac{\psi'(\theta)}{A(\theta)}$,

by virtue of which

$-A(\theta)\, \psi'(\theta) = -\{A(\theta)\}^2\, V(t)$ ...(12)

An examination shows that at $\hat\theta$ the first term on the right of (11) is zero by virtue of (10). Hence (11), by dint of (12), assumes the form

$(\log L)''_{\theta = \hat\theta} = -\{A(\hat\theta)\}^2\, V(t) < 0$ ...(13)

Hence by (13), each solution of (8) is a maximum of the LF. The reader must keep in mind that under regularity conditions there must exist a minimum between two successive maxima. As a matter of fact, there exists no minimum, and hence there cannot be more than one maximum. This implies that the solution is unique. This completes the proof.





Remark $R_1$. The uniqueness of the solution of $(\partial/\partial\theta) \log L = 0$ is obvious from the uniqueness of the MVB estimator $t$.

$R_2$. Where an MVB (unbiased) estimator exists, it is given by the ML method.

Theorem 16. The ML estimator $\hat\theta$ obtained by minimizing the range {of $f(x \mid \theta)$} depending upon $\theta$ is unique when a single sufficient statistic exists.

Proof. Assume that the range of $f(x \mid \theta)$ depends upon $\theta$. Then a single sufficient statistic only exists if

$f(x \mid \theta) = g(x)/h(\theta)$ ...(14)

Then the LF is of the form

$L(x \mid \theta) = \prod_{j=1}^{n} g(x_j) \big/ \{h(\theta)\}^n$ ...(15)

Obviously (15) is as large as possible if $h(\theta)$ is as small as possible. (14) furnishes:

$1 = \int f(x \mid \theta)\, dx = \int g(x)\, dx \big/ h(\theta)$,

where the integration is taken over the whole range of $x$. Therefore

$h(\theta) = \int g(x)\, dx$, ...(16)

which shows that to make $h(\theta)$ as small as possible we must pick $\hat\theta$ so that the value of the integral on the right (one or both of whose limits of integration depend upon $\theta$) is minimized.

Recall that a single sufficient statistic for $\theta$ exists only if one terminal of the range is independent of $\theta$ or if the upper terminal is a monotone decreasing function of the lower terminal. In either of these events, we see that the value of (16) is a monotone function of the range of integration on the right-hand side, consistent with the observations, and the value reaches a unique terminal minimum when the range is as small as possible. This proves that the ML estimator $\hat\theta$ obtained by minimizing the range is unique, and the LF (15) has a terminal maximum at $L(x \mid \hat\theta)$.
Illustrative Examples. 

Ex. 1. For the binomial distribution with density function

$f(x \mid p) = p^x q^{1-x}$, $x = 0, 1$, $q = 1 - p$,

the maximum likelihood estimator for $p$ is the sample mean, and it is also a sufficient estimator. (Bombay 1967, B. Sc. 1967)

For, the definition of the likelihood function LF provides:

$L(x \mid p) = \prod_j p^{x_j} q^{1 - x_j} = p^{\sum x_j}\, q^{\,n - \sum x_j}$ ...(1)

Then $\log L(x \mid p) = (\sum x_j) \log p + (n - \sum x_j) \log q$, and hence

$\frac{\partial}{\partial p} \log L(x \mid p) = \frac{\sum x_j}{p} - \frac{n - \sum x_j}{q}$.

The likelihood equation:

$\frac{\partial}{\partial p} \log L(x \mid p) = 0$

yields

$q \sum x_j - np + p \sum x_j = 0$,

i.e.

$\sum x_j - np = 0$,

which furnishes:

$\hat{p} = \frac{\sum x_j}{n} = \bar{x}$.

This asserts that $\bar{x}$ is the ML estimator for the parameter $p$.

To show that $\bar{x}$ is sufficient, we shall demonstrate that the conditional distribution of the $x_j$ given $\bar{x}$ is independent of $p$. Set $\sum x_j = n\bar{x} = y$. As a matter of fact, the marginal distribution of $n\bar{x} = y$ is given by

$\binom{n}{y} p^y q^{n-y}$ ...(2)

The conditional distribution of the $x_j$ given $y$ is obtained by dividing (1) by (2) to find

$k(x_1, \ldots, x_n \mid y) = \frac{1}{\binom{n}{y}}$, $x_j = 0, 1$, $\sum x_j = y$, ...(3)

a distribution which is independent of $p$.
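
The independence of the conditional distribution from $p$ can be checked by direct enumeration. The sketch below is only an illustration: the function name and the choices $n = 4$, $y = 2$ and the two trial values of $p$ are hypothetical. It divides the joint probability of each 0/1 sequence with total $y$ by the marginal probability of that total.

```python
from itertools import product
from math import comb

def conditional_given_total(n, y, p):
    """Conditional probability of each 0/1 sequence of length n given its total y."""
    q = 1.0 - p
    marginal = comb(n, y) * p**y * q**(n - y)    # probability that the total equals y
    result = {}
    for xs in product((0, 1), repeat=n):
        if sum(xs) == y:
            joint = p**y * q**(n - y)            # probability of this particular sequence
            result[xs] = joint / marginal
    return result

for p in (0.2, 0.7):
    probs = conditional_given_total(4, 2, p)
    print(p, len(probs), sorted(set(round(v, 12) for v in probs.values())))
```

For either value of $p$ each of the $\binom{4}{2} = 6$ admissible sequences receives the same conditional probability $1/6$.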

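Ex. 2. For a binomial population with density function

$f(x \mid p) = \binom{n}{x} p^x q^{n-x}$, $x = 0, 1, 2, \ldots, n$, $q = 1 - p$,

the ML estimator $\hat{p}$ is $\bar{x}/n$ and $V(\hat{p}) = pq/n^2$ for a sample of $n$ observations; for a single observation $x$ this reduces to $\hat{p} = x/n$ with $V(\hat{p}) = pq/n$. [Agra B.Sc. 1969]

For, evidently

$L(x \mid p) = \prod_j f(x_j \mid p) = \prod_j \binom{n}{x_j} \cdot p^{\sum x_j}\, q^{\sum (n - x_j)}$,

so that

$\log L(x \mid p) = \log \prod_j \binom{n}{x_j} + (\sum_j x_j) \log p + \sum_j (n - x_j) \log q$. ...(1)

Then the likelihood equation

$\frac{\partial}{\partial p} \log L(x \mid p) = 0$

yields:

$\frac{\sum_j x_j}{p} - \frac{\sum_j (n - x_j)}{q} = 0$, ...(2)

from which we obtain

$\sum_{j=1}^{n} x_j = p \sum_{j=1}^{n} n = n^2 p$,

i.e. $n\bar{x} = n^2 p$,

which furnishes

$\hat{p} = \frac{\bar{x}}{n}$.

Then $V(\hat{p}) = V\left(\frac{\bar{x}}{n}\right) = \frac{1}{n^2} V(\bar{x}) = \frac{1}{n^2} \cdot \frac{npq}{n} = \frac{pq}{n^2}$,

since each $x_j$ has variance $npq$. When the estimate is based on a single observation $x$, the same argument gives $V(\hat{p}) = \frac{1}{n^2} V(x) = \frac{npq}{n^2} = \frac{pq}{n}$.

This proves our assertion.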

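Ex. 3. For Poisson's distribution with density function

$f(x \mid \lambda) = e^{-\lambda}\, \frac{\lambda^x}{x!}$,

the ML estimator for the parameter $\lambda$ is $\hat\lambda = \bar{x}$ and $V(\hat\lambda) = \lambda/n$.

For, the probability of $n$ independent observations from a Poisson distribution is

$L(x \mid \lambda) = e^{-n\lambda}\, \lambda^{\sum_j x_j} \Big/ \prod_{j=1}^{n} x_j!$

Then $\log L(x \mid \lambda) = -n\lambda + \sum_j x_j \log \lambda - \sum_j \log x_j!$,

which, on differentiating with regard to $\lambda$, furnishes:

$\frac{\partial}{\partial\lambda} \log L(x \mid \lambda) = -n + \frac{1}{\lambda} \sum_j x_j$,

which when equated to zero gives

$\hat\lambda = \frac{\sum_j x_j}{n} = \bar{x}$.

This shows that the ML estimator for $\lambda$ is $\hat\lambda = \bar{x}$.

Evidently $V(\hat\lambda) = V(\bar{x}) = \sigma^2/n = \lambda/n$.

The reader will convince himself by verifying that

$V(\hat\lambda) = V\left(\frac{1}{n} \sum_j x_j\right) = \frac{1}{n^2} \sum_j V(x_j) = \frac{n\lambda}{n^2}$, so that $V(\hat\lambda) = \frac{\lambda}{n}$.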


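Ex. 4. The ML estimate of the parameter $\theta$ in the type III distribution

$dF = \frac{1}{\Gamma(p)}\, \frac{e^{-x/\theta}\, x^{p-1}}{\theta^p}\, dx$,

where $p$ is known, is $\hat\theta = \bar{x}/p$, with variance $V(\hat\theta) = \theta^2/(np)$.

For, we have

$f(x_j \mid \theta) = \frac{x_j^{\,p-1}\, e^{-x_j/\theta}}{\Gamma(p)\, \theta^p}$,

which yields:

$\log f(x_j \mid \theta) = (p - 1) \log x_j - \frac{x_j}{\theta} - \log \Gamma(p) - p \log \theta$.

Then $\log L = -\frac{1}{\theta} \sum_j x_j - np \log \theta$, on dropping terms independent of $\theta$.

The ML equation $\frac{\partial}{\partial\theta} \log L = 0$ leads to

$\frac{\sum_j x_j}{\theta^2} - \frac{np}{\theta} = 0$,

from which we obtain

$\hat\theta = \frac{\sum_j x_j}{np} = \frac{\bar{x}}{p}$.

Now $\frac{\partial}{\partial\theta} \log L = \frac{\sum_j x_j}{\theta^2} - \frac{np}{\theta} = \frac{n\bar{x}}{\theta^2} - \frac{np}{\theta} = \frac{np}{\theta^2}\left(\frac{\bar{x}}{p} - \theta\right) = \frac{\bar{x}/p - \theta}{\theta^2/(np)}$.

This implies that

$V(\hat\theta) = \frac{\theta^2}{np}$.

This completes the proof of the problem.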

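Ex. 5. The ML estimators of the parameters $\alpha$ and $\lambda$ (large) of the distribution

$f(x \mid \alpha, \lambda) = \frac{1}{\Gamma(\lambda)} \left(\frac{\lambda}{\alpha}\right)^{\lambda} \exp\left(-\frac{\lambda x}{\alpha}\right) x^{\lambda - 1}$, $0 \le x < \infty$,

are $\bar{x}$ and $1\Big/\left\{2\left(\log \bar{x} - \frac{1}{n}\sum_j \log x_j\right)\right\}$ respectively.

We may use $(d/d\lambda) \log \Gamma(\lambda) = \log \lambda - (1/2\lambda)$ for large $\lambda$.

(Agra B. Sc. 1968, Poona 63, Bombay 1956)

For, our hypothesis gives

$\log f(x \mid \alpha, \lambda) = -\log \Gamma(\lambda) + \lambda(\log \lambda - \log \alpha) - (\lambda x/\alpha) + (\lambda - 1) \log x$.

Then $\log L(x \mid \alpha, \lambda) = -n \log \Gamma(\lambda) + n\lambda(\log \lambda - \log \alpha) - (\lambda/\alpha) \sum_j x_j + (\lambda - 1) \sum_j \log x_j$ ...(1)

Dropping terms independent of $\alpha$ and then differentiating with regard to $\alpha$, we find

$\frac{\partial}{\partial\alpha} \log L(x \mid \alpha) = -\frac{n\lambda}{\alpha} + \frac{\lambda}{\alpha^2} \sum_j x_j$.

The likelihood equation $(\partial/\partial\alpha) \log L(x \mid \alpha) = 0$ yields:

$-\frac{n\lambda}{\alpha} + \frac{\lambda}{\alpha^2} \sum_j x_j = 0$,

from which we obtain

$\hat\alpha = (1/n) \sum_j x_j = \bar{x}$ ...(2)

Again dropping terms independent of $\lambda$ in (1) and then differentiating with respect to $\lambda$, we obtain

$(\partial/\partial\lambda) \log L(x \mid \lambda) = -n \frac{d}{d\lambda} \log \Gamma(\lambda) + n(\log \lambda - \log \alpha) + n - (1/\alpha) \sum_j x_j + \sum_j \log x_j$

$= -n\{\log \lambda - (1/2\lambda)\} + n \log \lambda - n \log \alpha + n - (1/\alpha) \sum_j x_j + \sum_j \log x_j$

$= (n/2\lambda) - n \log \bar{x} + \sum_j \log x_j$,

on putting $\alpha = \hat\alpha = \bar{x}$ from (2).

Then the likelihood equation $(\partial/\partial\lambda) \log L(x \mid \lambda) = 0$ leads to

$(n/2\lambda) - n \log \bar{x} + \sum_j \log x_j = 0$,

from which we find: $(1/2\lambda) = \log \bar{x} - \frac{1}{n} \sum_j \log x_j$.

This furnishes:

$\hat\lambda = 1\Big/\left\{2\left(\log \bar{x} - \frac{1}{n} \sum_j \log x_j\right)\right\}$ ...(3)

and we are done.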

Ex. 6. Find the ML estimator for $\theta$ based on $n$ observations for the frequency function

$f(x \mid \theta) = (1 + \theta)\, x^{\theta}$, $\theta > 0$, $0 \le x \le 1$.

(Agra B.Sc. 1966)

We have

$\log L(x \mid \theta) = n \log(1 + \theta) + \theta \sum_j \log x_j$.

Then $(\partial/\partial\theta) \log L(x \mid \theta) = 0$ leads to

$\frac{n}{1 + \theta} = -\sum_j \log x_j = -\log(x_1 x_2 \cdots x_n)$,

giving: $\hat\theta = -1 - \frac{n}{\log(x_1 x_2 \cdots x_n)} = -1 - \frac{n}{\sum_j \log x_j}$,

which is the requested ML estimator for $\theta$.
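
As a numerical check, the estimator of this example can be evaluated on simulated data. The sketch below is an illustration only: the function names, the true value $\theta = 2$, the sample size and the seed are hypothetical, and the observations are drawn by inverting the distribution function $F(x) = x^{1+\theta}$ implied by the stated density.

```python
import math
import random

def theta_hat(xs):
    """ML estimator -1 - n / sum(log x_j) for the density (1 + theta) * x**theta."""
    return -1.0 - len(xs) / sum(math.log(x) for x in xs)

def draw(theta, n, rng):
    """Inverse-CDF sampling: F(x) = x**(1 + theta) on (0, 1]."""
    return [(1.0 - rng.random()) ** (1.0 / (1.0 + theta)) for _ in range(n)]

rng = random.Random(7)
xs = draw(2.0, 5000, rng)
est = theta_hat(xs)
print(est)
```

With a sample of this size the estimate should lie close to the value of $\theta$ used to generate the data.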

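Ex. 7. For random sampling from a normal population, find the maximum likelihood estimators for

(a) the population mean, when the population variance is known,

(b) the population variance, when the population mean is known, and

(c) the simultaneous estimation of both the population mean and variance.

The frequency function $N(\mu, \sigma^2)$ yields:

$f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right]$.

Then the LF is

$L(x \mid \mu, \sigma) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n \exp\left[-\frac{1}{2\sigma^2} \sum_{j=1}^{n} (x_j - \mu)^2\right]$

$= \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n \exp\left[-\frac{n}{2\sigma^2}\{(\bar{x} - \mu)^2 + s^2\}\right]$,

so that $\log L(x \mid \mu, \sigma) = C - \frac{n}{2} \log \sigma^2 - \frac{n}{2\sigma^2}\{(\bar{x} - \mu)^2 + s^2\}$,

where $C$ is a constant.

(a) By hypothesis, $\sigma^2$ is known. The ML estimator for $\mu$ is given by the likelihood equation

$\frac{\partial \log L(x \mid \mu, \sigma^2)}{\partial\mu} = \frac{\partial}{\partial\mu}\left[-\frac{n}{2\sigma^2}(\bar{x} - \mu)^2\right] = 0$,

which yields: $\bar{x} - \mu = 0$, i.e. $\hat\mu = \bar{x}$.

(b) Again our hypothesis reads that $\mu$ is known. Then the estimator for $\sigma^2$ is obtained from the likelihood equation

$\frac{\partial}{\partial\sigma^2} \log L(x \mid \mu, \sigma^2) = \frac{\partial}{\partial\sigma^2}\left[-\frac{n}{2} \log \sigma^2 - \frac{1}{2\sigma^2} \sum_j (x_j - \mu)^2\right] = 0$,

from which we find

$-\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_j (x_j - \mu)^2 = 0$.

This provides:

$\hat\sigma^2 = \frac{1}{n} \sum_{j=1}^{n} (x_j - \mu)^2$.

(c) The simultaneous estimation of $\mu$ and $\sigma^2$ furnishes:

$\partial/\partial\mu\, \log L(x \mid \mu, \sigma^2) = 0$ with $\partial/\partial\sigma^2\, \log L(x \mid \mu, \sigma^2) = 0$ gives

$\hat\mu = \bar{x}$,

and $\partial/\partial\sigma^2\, \log L(x \mid \mu, \sigma^2) = 0$ yields

$\hat\sigma^2 = \frac{1}{n} \sum_{j=1}^{n} (x_j - \hat\mu)^2 = \frac{1}{n} \sum_{j=1}^{n} (x_j - \bar{x})^2 = s^2$.

In fact, $\bar{x}$ and $(1/n) \sum_{j=1}^{n} (x_j - \bar{x})^2$ are the sample moments corresponding to $\mu$ and $\sigma^2$. The estimator $\hat\mu$ is unbiased but $\hat\sigma^2$ is not, because

$E(\hat\sigma^2) = \frac{n - 1}{n}\, \sigma^2$.

Remark. It is possible to estimate $\mu$ without estimating $\sigma^2$, but it is not possible to estimate $\sigma^2$ without first estimating $\mu$. In the one parameter case, ML estimators need not be unbiased.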
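
The bias of the ML estimator of the variance can be seen in a small simulation. The sketch below is an illustration under hypothetical settings (sample size 5, mean 10, standard deviation 2, a fixed seed); it averages the ML variance estimate over many samples and compares the average with $\frac{n-1}{n}\sigma^2$.

```python
import random
from statistics import mean

def ml_estimates(xs):
    """ML estimates of the mean and variance of a normal sample (divisor n)."""
    n = len(xs)
    mu_hat = sum(xs) / n
    var_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return mu_hat, var_hat

rng = random.Random(11)
n, mu, sigma, reps = 5, 10.0, 2.0, 40000
var_hats = [
    ml_estimates([rng.gauss(mu, sigma) for _ in range(n)])[1]
    for _ in range(reps)
]
avg_var_hat = mean(var_hats)
expected = (n - 1) / n * sigma**2      # theoretical expectation of the ML estimate
print(avg_var_hat, expected, sigma**2)
```

The simulated average settles near $\frac{n-1}{n}\sigma^2$ rather than $\sigma^2$, which is the bias stated above.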

Ex. 8. Find the ML estimator for the parameter in the distribution

$dF(x) = dx/\theta$, $0 \le x \le \theta$. [Agra B.Sc. 1972]

The LF is $L(x \mid \theta) = \theta^{-n}$,

whose maximization is obtained by minimizing our estimator $\hat\theta$. That the largest observation $x_{(n)}$ satisfies

$x_{(n)} \le \theta$

is a fact. Hence the ML estimator for $\theta$, which is also sufficient, is

$\hat\theta = x_{(n)}$.

Note that $\hat\theta$ is a biased estimator of $\theta$. The reader will verify for himself that the modified unbiased estimator is

$\tilde\theta = (n + 1)\, x_{(n)}/n$.
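
The bias of the largest observation, and its removal by the factor $(n+1)/n$, can be checked by simulation. The following sketch uses hypothetical settings ($\theta = 5$, $n = 4$, a fixed seed) that are not part of the example.

```python
import random
from statistics import mean

rng = random.Random(3)
theta, n, reps = 5.0, 4, 40000

# ML estimate (the largest observation) in each of many simulated samples
ml_values = [max(rng.uniform(0, theta) for _ in range(n)) for _ in range(reps)]

avg_ml = mean(ml_values)                 # close to n * theta / (n + 1)
avg_unbiased = (n + 1) / n * avg_ml      # close to theta
print(avg_ml, avg_unbiased)
```

The average of the largest observation falls short of $\theta$ by the factor $n/(n+1)$, and rescaling by $(n+1)/n$ restores it.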

Ex. 9. The density for samples of size $n$ from the uniform distribution over the range $\alpha \le x \le \beta$ is

$\frac{1}{(\beta - \alpha)^n}$.

Find the ML estimators for the parameters $\alpha$ and $\beta$.

For, the log LF is

$\log L(x \mid \alpha, \beta) = -n \log(\beta - \alpha)$ ...(1)

The likelihood equations to estimate $\alpha$ and $\beta$ are

$(\partial/\partial\alpha) \log L(x \mid \alpha, \beta) = n/(\beta - \alpha) = 0$
and $(\partial/\partial\beta) \log L(x \mid \alpha, \beta) = -n/(\beta - \alpha) = 0$.

If we wish to solve the above equations, we shall find that at least one of $\alpha$, $\beta$ must be infinite, an absurd result. Reason: the likelihood does not have zero slope at its maximum, and we must locate its maximum by other devices. Note that the LF

$\frac{1}{(\beta - \alpha)^n}$

is maximized if $(\beta - \alpha)$ is minimized. Assume that we have a sample of $n$ observations $x_1, x_2, \ldots, x_n$ in hand such that $x_{(1)}$ and $x_{(n)}$ are the extreme (smallest and largest) observations. This implies that $\alpha$ can be no larger than $x_{(1)}$ and $\beta$ can be no smaller than $x_{(n)}$. Hence the minimum possible value for $\beta - \alpha$ is $x_{(n)} - x_{(1)}$. The ML estimators are evidently

$\hat\alpha = x_{(1)}$,

$\hat\beta = x_{(n)}$.

This is a somewhat curious result because we have made no use of the intervening observations $x_{(j)}$, $j = 2, 3, \ldots, n-1$.

Remark. The extreme observations $x_{(1)}$ and $x_{(n)}$ are a pair of jointly sufficient statistics for $\alpha$ and $\beta$ (vide Ex. 8). In view of Ex. 8, $x_{(1)}$ is individually sufficient for $\alpha$ and $x_{(n)}$ for $\beta$.

Ex. 10. A random sample $x_1, x_2, \ldots, x_n$ is drawn from the exponential population with density function

$f(x \mid \alpha, \beta) = y_0 \exp\{-\beta(x - \alpha)\}$, $\alpha \le x < \infty$, $\beta > 0$,

where $y_0$ is a constant. Find the maximum likelihood estimators for $\alpha$ and $\beta$.

Evidently, $\int_{\alpha}^{\infty} y_0 \exp\{-\beta(x - \alpha)\}\, dx = 1$

provides: $y_0 = \beta$.

The LF is then

$L(x \mid \alpha, \beta) = \beta^n \exp\left\{-\beta \sum_j (x_j - \alpha)\right\}$ ...(1)

so that $\log L(x \mid \alpha, \beta) = n \log \beta - n\beta(\bar{x} - \alpha)$.

The ML estimators of $\alpha$ and $\beta$ are roots of

$(\partial/\partial\alpha) \log L(x \mid \alpha, \beta) = 0 = n\beta$ ...(2)

$(\partial/\partial\beta) \log L(x \mid \alpha, \beta) = 0 = (n/\beta) - n(\bar{x} - \alpha)$ ...(3)

Equation (2) gives $\beta = 0$, which is evidently inadmissible, and by dint of which equation (3) fails to give a finite value of $\alpha$. We infer that in this event the method of differentiation fails.

Hence we shall choose $\alpha$ and $\beta$ such that the LF is maximized. From (1), we observe that whatever the value of $\beta$ $(> 0)$ may be, the LF is maximized by choosing the value of $\alpha$ as large as possible. But the condition $\alpha \le x < \infty$ implies that $\alpha$ must be less than or equal to each member of the sample. Hence the largest value which $\alpha$ can assume, consistent with the sample, is the smallest value of the sample, say $x_{(1)}$. Consequently,

$\hat\alpha = x_{(1)}$ and then $\hat\beta = \frac{1}{\bar{x} - x_{(1)}}$.
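
The two estimators of this example are easily computed from data. The sketch below is an illustration with hypothetical settings ($\alpha = 2$, $\beta = 1.5$, a sample of 2000 values, a fixed seed); the observations are simulated by adding the location $\alpha$ to ordinary exponential variates with rate $\beta$.

```python
import random

def ml_shifted_exponential(xs):
    """ML estimates: alpha is the smallest observation, beta = 1 / (mean - smallest)."""
    alpha_hat = min(xs)
    beta_hat = 1.0 / (sum(xs) / len(xs) - alpha_hat)
    return alpha_hat, beta_hat

rng = random.Random(5)
alpha, beta, n = 2.0, 1.5, 2000
xs = [alpha + rng.expovariate(beta) for _ in range(n)]
a_hat, b_hat = ml_shifted_exponential(xs)
print(a_hat, b_hat)
```

The estimate of $\alpha$ can never fall below the true lower terminal, and for a sample of this size both estimates should lie close to the generating values.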

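Ex. 11. Independent samples of sizes $n_1$ and $n_2$ are picked from two normal populations with equal means $\mu$ and variances respectively $\lambda\sigma^2$, $\sigma^2$. The maximum likelihood estimator of $\mu$ is then given by

$\hat\mu = \left(\frac{n_1\bar{x}_1}{\lambda} + n_2\bar{x}_2\right) \Big/ \left(\frac{n_1}{\lambda} + n_2\right)$

and its large sample variance is

$V(\hat\mu) = \sigma^2 \Big/ \left(\frac{n_1}{\lambda} + n_2\right)$.

Further the unbiased estimate

$t = (n_1\bar{x}_1 + n_2\bar{x}_2)/(n_1 + n_2)$

has efficiency

$\lambda(n_1 + n_2)^2 \big/ (n_1\lambda + n_2)(n_1 + n_2\lambda)$,

which attains the value 1 iff $\lambda = 1$.

Assume that $x_{1i}$ $(i = 1, 2, \ldots, n_1)$ and $x_{2j}$ $(j = 1, 2, \ldots, n_2)$ are two independent samples from the given normal populations respectively. The likelihood function of the $(n_1 + n_2)$ observations is then

$L = \left(\frac{1}{\sqrt{2\pi\lambda\sigma^2}}\right)^{n_1} \exp\left\{-\sum_{i=1}^{n_1} \frac{(x_{1i} - \mu)^2}{2\lambda\sigma^2}\right\} \times \left(\frac{1}{\sqrt{2\pi\sigma^2}}\right)^{n_2} \exp\left\{-\sum_{j=1}^{n_2} \frac{(x_{2j} - \mu)^2}{2\sigma^2}\right\}$.

Hence $\log L = C - \frac{1}{2\lambda\sigma^2} \sum_{i=1}^{n_1} (x_{1i} - \mu)^2 - \frac{1}{2\sigma^2} \sum_{j=1}^{n_2} (x_{2j} - \mu)^2$,

where $C$ is a constant with regard to $\mu$.

The likelihood equation to estimate $\mu$ is

$\frac{\partial}{\partial\mu} \log L = 0$,

which gives:

$\frac{2}{2\lambda\sigma^2} \sum_{i=1}^{n_1} (x_{1i} - \mu) + \frac{2}{2\sigma^2} \sum_{j=1}^{n_2} (x_{2j} - \mu) = 0$,

yielding

$\frac{1}{\lambda} \sum_{i=1}^{n_1} (x_{1i} - \mu) + \sum_{j=1}^{n_2} (x_{2j} - \mu) = 0$,

i.e. $\frac{n_1}{\lambda}(\bar{x}_1 - \mu) + n_2(\bar{x}_2 - \mu) = 0$,

or

$\mu\left(\frac{n_1}{\lambda} + n_2\right) = \frac{n_1\bar{x}_1}{\lambda} + n_2\bar{x}_2$.

Hence $\hat\mu = \left(\frac{n_1\bar{x}_1}{\lambda} + n_2\bar{x}_2\right) \Big/ \left(\frac{n_1}{\lambda} + n_2\right)$.

Now

$V(\hat\mu) = \frac{1}{(n_1/\lambda + n_2)^2}\left[\frac{n_1^2}{\lambda^2} V(\bar{x}_1) + n_2^2 V(\bar{x}_2)\right]$,

noting that the covariance term vanishes in view of independence of the samples.

As a matter of fact, $V(\bar{x}_1) = \lambda\sigma^2/n_1$ and $V(\bar{x}_2) = \sigma^2/n_2$.

Hence $V(\hat\mu) = \frac{1}{(n_1/\lambda + n_2)^2}\left[\frac{n_1^2}{\lambda^2} \cdot \frac{\lambda\sigma^2}{n_1} + n_2^2 \cdot \frac{\sigma^2}{n_2}\right] = \frac{\sigma^2}{n_1/\lambda + n_2}$.

By hypothesis, $t = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$.

Then $E(t) = \frac{n_1\mu + n_2\mu}{n_1 + n_2} = \mu$.

This shows that $t$ is an unbiased estimate of $\mu$.

Now $V(t) = \frac{1}{(n_1 + n_2)^2}\left[n_1^2 V(\bar{x}_1) + n_2^2 V(\bar{x}_2)\right]$

$= \frac{1}{(n_1 + n_2)^2}\left[n_1^2\left(\frac{\lambda\sigma^2}{n_1}\right) + n_2^2\left(\frac{\sigma^2}{n_2}\right)\right]$

$= (n_1\lambda + n_2)\, \sigma^2/(n_1 + n_2)^2$.

The M. L. estimators are most efficient (vide Theorem 20 ($R_3$)). The efficiency $e$ of $t$ is then given by

$e = \frac{V(\hat\mu)}{V(t)} = \frac{(n_1 + n_2)^2}{(n_1\lambda + n_2)\left(\frac{n_1}{\lambda} + n_2\right)} = \frac{\lambda(n_1 + n_2)^2}{(n_1\lambda + n_2)(n_1 + n_2\lambda)}$.

Letting $e = 1$,

$\lambda(n_1 + n_2)^2 = (n_1\lambda + n_2)(n_1 + n_2\lambda)$,

which gives

$\lambda(n_1^2 + n_2^2) + 2n_1 n_2 \lambda = \lambda(n_1^2 + n_2^2) + n_1 n_2(\lambda^2 + 1)$.

Comparing coefficients provides:

$2\lambda = \lambda^2 + 1$,

which gives $\lambda = 1$.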
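
The efficiency formula of this example can be evaluated directly. The sketch below is an illustration with hypothetical sample sizes and values of $\lambda$; it computes the efficiency both from the closed form and as the ratio of the two variances (taking $\sigma^2 = 1$, an assumption made only for the computation).

```python
def efficiency(lam, n1, n2):
    """Closed form: lam * (n1 + n2)**2 / ((n1*lam + n2) * (n1 + n2*lam))."""
    return lam * (n1 + n2) ** 2 / ((n1 * lam + n2) * (n1 + n2 * lam))

def variance_ratio(lam, n1, n2):
    """V(ML estimator) / V(t), with sigma**2 taken as 1."""
    var_ml = 1.0 / (n1 / lam + n2)
    var_t = (n1 * lam + n2) / (n1 + n2) ** 2
    return var_ml / var_t

for lam in (0.5, 1.0, 2.0, 10.0):
    print(lam, efficiency(lam, 7, 12), variance_ratio(lam, 7, 12))
```

The two computations agree, the efficiency equals 1 at $\lambda = 1$, and it falls below 1 for the other values tried, in line with the identity $(n_1\lambda + n_2)(n_1 + n_2\lambda) - \lambda(n_1 + n_2)^2 = n_1 n_2(\lambda - 1)^2$.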

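Ex. 12. Let $x_j$ $(j = 1, 2, \ldots, n)$ be independent normal variates with

$E(x_j) = j\theta$ and $V(x_j) = j^3\sigma^2$.

Then the maximum likelihood estimator of $\theta$ is

$\hat\theta = \sum_{j=1}^{n} \frac{x_j}{j^2} \Big/ \sum_{j=1}^{n} \frac{1}{j}$;

$\hat\theta$ is unbiased for $\theta$ and

$V(\hat\theta) = \sigma^2 \Big/ \sum_{j=1}^{n} \frac{1}{j}$.

In view of our hypothesis,

$f(x_j) = \frac{1}{\sigma\sqrt{2\pi j^3}} \exp\left\{-\frac{(x_j - j\theta)^2}{2 j^3 \sigma^2}\right\}$.

As a matter of fact, $x_j$ $(j = 1, 2, \ldots, n)$ are independent. Their likelihood function is then

$L = \prod_{j=1}^{n} \frac{1}{\sigma\sqrt{2\pi j^3}} \exp\left\{-\frac{(x_j - j\theta)^2}{2 j^3 \sigma^2}\right\}$,

which gives

$\log L = C - \frac{1}{2\sigma^2} \sum_{j=1}^{n} \frac{(x_j - j\theta)^2}{j^3}$,

where $C$ is a constant independent of $\theta$.

Now the likelihood equation for estimating $\theta$ is

$\frac{\partial \log L}{\partial\theta} = \frac{1}{\sigma^2} \sum_{j=1}^{n} \frac{x_j - j\theta}{j^2} = 0$,

i.e.

$\sum_{j=1}^{n} \frac{x_j}{j^2} - \theta \sum_{j=1}^{n} \frac{1}{j} = 0$.

Then $\hat\theta = \sum_{j=1}^{n} (x_j/j^2) \Big/ \sum_{j=1}^{n} \frac{1}{j}$.

Now $E(\hat\theta) = \sum_{j=1}^{n} \frac{E(x_j)}{j^2} \Big/ \sum_{j=1}^{n} \frac{1}{j}$

$= \sum_{j=1}^{n} (j\theta/j^2) \Big/ \sum_{j=1}^{n} \frac{1}{j} = \theta$.

This proves that $\hat\theta$ is an unbiased estimate of $\theta$.

Now $V(\hat\theta) = \frac{1}{\left(\sum_{j=1}^{n} \frac{1}{j}\right)^2} \sum_{j=1}^{n} V(x_j/j^2)$,

where the covariance terms vanish in view of independence of the $x_j$'s.

Therefore $V(\hat\theta) = \frac{1}{\left(\sum_{j=1}^{n} \frac{1}{j}\right)^2} \sum_{j=1}^{n} \frac{j^3\sigma^2}{j^4} = \frac{\sigma^2 \sum_{j=1}^{n} \frac{1}{j}}{\left(\sum_{j=1}^{n} \frac{1}{j}\right)^2} = \sigma^2 \Big/ \sum_{j=1}^{n} \frac{1}{j}$.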
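
The estimator of this example and its variance can be checked numerically. The sketch below is an illustration under hypothetical settings ($n = 5$, $\theta = 1$, $\sigma = 1$, a fixed seed): it computes the weighted estimator from simulated normal variates with mean $j\theta$ and variance $j^3\sigma^2$, and compares the simulated variance with $\sigma^2/\sum (1/j)$.

```python
import random
from statistics import pvariance

def theta_hat(xs):
    """xs[j-1] holds x_j, with E(x_j) = j*theta and V(x_j) = j**3 * sigma**2."""
    num = sum(x / j**2 for j, x in enumerate(xs, start=1))
    den = sum(1.0 / j for j in range(1, len(xs) + 1))
    return num / den

def theoretical_variance(n, sigma):
    """sigma**2 divided by the sum of 1/j for j = 1..n."""
    return sigma**2 / sum(1.0 / j for j in range(1, n + 1))

rng = random.Random(2)
n, theta, sigma, reps = 5, 1.0, 1.0, 20000
estimates = [
    theta_hat([rng.gauss(j * theta, sigma * j**1.5) for j in range(1, n + 1)])
    for _ in range(reps)
]
simulated_variance = pvariance(estimates)
print(simulated_variance, theoretical_variance(n, sigma))
```

When the observations equal their expectations exactly, the estimator returns $\theta$ itself, which is a quick check of unbiasedness of the formula.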


Ex. 13. Let a most efficient estimator $A$ and a less efficient estimator $B$ with efficiency $e$ tend to joint normality for large samples. Then

(i) $(B - A)$ tends to zero correlation with $A$.

(ii) The error in $B$ may be regarded as composed (for large samples) of two parts which are independent, the error in $A$ and the error in $(B - A)$.

(iii) $V(B - A) = \left(\frac{1}{e} - 1\right) V(A)$.

(i) We wish to demonstrate that

$r_{A,(B-A)} = 0$, i.e. $\mathrm{Cov}(A, B - A) = 0$.

In fact,

$\mathrm{Cov}(A, B - A) = E[\{A - E(A)\}\{(B - A) - E(B - A)\}]$
$= E[\{A - E(A)\}\{(B - E(B)) - (A - E(A))\}]$

$= E[\{A - E(A)\}\{B - E(B)\}] - E\{A - E(A)\}^2$
$= \mathrm{Cov}(A, B) - V(A)$

$= \rho\sigma_A\sigma_B - \sigma_A^2$,

where $\rho$ denotes the correlation coefficient between $A$ and $B$.

Since the efficiency of $B$ is $e = \sigma_A^2/\sigma_B^2$, we have

$\sigma_B = \sigma_A/\sqrt{e}$.

But $\rho = \sqrt{e}$ in view of Theorem 9.

Then $\mathrm{Cov}(A, B - A) = \rho\sigma_A\sigma_B - \sigma_A^2 = \sqrt{e} \cdot \sigma_A \cdot \frac{\sigma_A}{\sqrt{e}} - \sigma_A^2$

$= \sigma_A^2 - \sigma_A^2 = 0$.

This shows that $(B - A)$ has zero correlation with $A$, and hence (i) is proved.

(ii) As a matter of fact,

$B = A + (B - A)$.

Then $V(B) = V(A + B - A)$

$= V(A) + V(B - A) + 2\, \mathrm{Cov}(A, B - A)$

$= V(A) + V(B - A)$, since $\mathrm{Cov}(A, B - A) = 0$ by (i).

This implies that

Error in $B$ = Error in $A$ + Error in $(B - A)$.

The result follows in view of independence of $A$ and $(B - A)$.

(iii) Now $V(A - B) = V(A) + V(B) - 2\, \mathrm{Cov}(A, B)$

$= \sigma_A^2 + \sigma_B^2 - 2\rho\sigma_A\sigma_B$

$= \sigma_A^2 + \frac{\sigma_A^2}{e} - 2\sqrt{e} \cdot \sigma_A \cdot \frac{\sigma_A}{\sqrt{e}}$

$= \left(\frac{1}{e} - 1\right)\sigma_A^2$.

This proves (iii).



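Ex. 14. A sample of size $n$ is drawn from each of the four normal populations which have the same variance $\sigma^2$. The means of the four populations are $a+b+c$, $a+b-c$, $a-b+c$ and $a-b-c$. Compute the M. L. E. for $a$, $b$, $c$ and $\sigma^2$.

Denote the sample observations by $x_{ij}$ $(i = 1, 2, 3, 4;\ j = 1, 2, \ldots, n)$. Our hypothesis says that the four samples from the four normal populations are independent. The likelihood function $L$ of all the sample observations $x_{ij}$ $(i = 1, 2, 3, 4;\ j = 1, 2, \ldots, n)$ is

$L = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{4n} \exp\left\{-\frac{1}{2\sigma^2} \sum_{i=1}^{4} \sum_{j=1}^{n} (x_{ij} - \mu_i)^2\right\}$ ...(1)

where $\mu_i$ $(i = 1, 2, 3, 4)$ is the mean of the $i$th population. (1) may be rewritten as

$L = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{4n} \exp\left[-\frac{1}{2\sigma^2}\left\{\sum_j (x_{1j} - \mu_1)^2 + \sum_j (x_{2j} - \mu_2)^2 + \sum_j (x_{3j} - \mu_3)^2 + \sum_j (x_{4j} - \mu_4)^2\right\}\right]$.

Then $\log L = \text{const} - 2n \log \sigma^2 - \frac{1}{2\sigma^2}\left[\sum_j (x_{1j} - a - b - c)^2 + \sum_j (x_{2j} - a - b + c)^2 + \sum_j (x_{3j} - a + b - c)^2 + \sum_j (x_{4j} - a + b + c)^2\right]$,

where $\text{const} = 4n \log\left(\frac{1}{\sqrt{2\pi}}\right)$ is constant with regard to $a$, $b$, $c$ and $\sigma^2$.

The M. L. E.'s for $a$, $b$, $c$ and $\sigma^2$ are the solutions of the simultaneous equations which are in fact the maximum likelihood equations for estimating $a$, $b$, $c$ and $\sigma^2$. They are as follows:

$\frac{\partial}{\partial a} \log L = 0$ ...(2)

$\frac{\partial}{\partial b} \log L = 0$ ...(3)

$\frac{\partial}{\partial c} \log L = 0$ ...(4)

$\frac{\partial}{\partial\sigma^2} \log L = 0$ ...(5)

(2) provides:

$-\frac{1}{2\sigma^2}\left[\sum_j (x_{1j} - a - b - c)(-2) + \sum_j (x_{2j} - a - b + c)(-2) + \sum_j (x_{3j} - a + b - c)(-2) + \sum_j (x_{4j} - a + b + c)(-2)\right] = 0$,

from which we obtain

$\sum_j (x_{1j} + x_{2j} + x_{3j} + x_{4j}) + n[(-a - b - c) + (-a - b + c) + (-a + b - c) + (-a + b + c)] = 0$,

i.e. $\sum_{j=1}^{n}\left(\sum_{i=1}^{4} x_{ij}\right) + n(-4a) = 0$,

from which we find

$\hat{a} = \frac{1}{4n} \sum_{i=1}^{4} \sum_{j=1}^{n} x_{ij} = \bar{x}$.

Now (3) furnishes

$-\frac{1}{2\sigma^2}\left[\sum_j (x_{1j} - a - b - c)(-2) + \sum_j (x_{2j} - a - b + c)(-2) + \sum_j (x_{3j} - a + b - c)(2) + \sum_j (x_{4j} - a + b + c)(2)\right] = 0$

or

$\sum_j x_{1j} + \sum_j x_{2j} - \sum_j x_{3j} - \sum_j x_{4j} - 4nb = 0$,

which yields:

$\hat{b} = \frac{1}{4}\left[\frac{1}{n}\sum_j x_{1j} + \frac{1}{n}\sum_j x_{2j} - \frac{1}{n}\sum_j x_{3j} - \frac{1}{n}\sum_j x_{4j}\right] = \frac{1}{4}(\bar{x}_1 + \bar{x}_2 - \bar{x}_3 - \bar{x}_4)$,

where $\bar{x}_i$ denotes the mean of the $i$th sample.

The same procedure applied to (4) gives:

$\hat{c} = \frac{1}{4}(\bar{x}_1 - \bar{x}_2 + \bar{x}_3 - \bar{x}_4)$.

(5) provides:

$-\frac{2n}{\sigma^2} + \frac{1}{2\sigma^4}\left[\sum_j (x_{1j} - a - b - c)^2 + \sum_j (x_{2j} - a - b + c)^2 + \sum_j (x_{3j} - a + b - c)^2 + \sum_j (x_{4j} - a + b + c)^2\right] = 0$,

which gives:

$\hat\sigma^2 = \frac{1}{4n}\left[\sum_j (x_{1j} - \hat{a} - \hat{b} - \hat{c})^2 + \sum_j (x_{2j} - \hat{a} - \hat{b} + \hat{c})^2 + \sum_j (x_{3j} - \hat{a} + \hat{b} - \hat{c})^2 + \sum_j (x_{4j} - \hat{a} + \hat{b} + \hat{c})^2\right]$.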
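
The four estimators of this example translate directly into a short computation. The sketch below is an illustration only; the function name and the data are hypothetical. Each argument is one of the four samples, in the order of the means $a+b+c$, $a+b-c$, $a-b+c$, $a-b-c$.

```python
def four_sample_mle(s1, s2, s3, s4):
    """ML estimates of a, b, c and sigma**2 for four equal-sized normal samples."""
    n = len(s1)
    m1, m2, m3, m4 = (sum(s) / n for s in (s1, s2, s3, s4))
    a = (m1 + m2 + m3 + m4) / 4
    b = (m1 + m2 - m3 - m4) / 4
    c = (m1 - m2 + m3 - m4) / 4
    fitted = (a + b + c, a + b - c, a - b + c, a - b - c)
    rss = sum(
        (x - mu) ** 2
        for sample, mu in zip((s1, s2, s3, s4), fitted)
        for x in sample
    )
    return a, b, c, rss / (4 * n)

# Hypothetical noise-free data generated with a = 1, b = 2, c = 3
print(four_sample_mle([6, 6], [0, 0], [2, 2], [-4, -4]))
```

With noise-free data of this kind the function returns the generating values of $a$, $b$, $c$ and a variance estimate of zero, which serves as a simple check of the formulas.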

Some Optimum Properties of Maximum Likelihood Estimation

Theorem 17. ML estimators are consistent.

By the definition of the likelihood function we have

$L = \prod_{j=1}^{n} f(x_j \mid \theta)$.

We shall denote $f(x_j \mid \theta)$ by $f_j$. The following assumptions are made:

(i) The derivatives

$\frac{\partial \log L}{\partial\theta}$ and $\frac{\partial^2 \log L}{\partial\theta^2}$

exist for every $\theta$ in a range $R$ including the true value, and for almost all $x$. For each $\theta$ in $R$,

$\left|\frac{\partial f}{\partial\theta}\right| < F_1(x)$, $\left|\frac{\partial^2 f}{\partial\theta^2}\right| < F_2(x)$,

where $F_1$ and $F_2$ are integrable functions on $(-\infty, \infty)$.

(ii) The derivative $\frac{\partial^3 \log L}{\partial\theta^3}$ exists and is such that

$\left|\frac{\partial^3 \log L}{\partial\theta^3}\right| < M(x)$, $E\{M(x)\} < k$ (a positive quantity).

(iii) For each $\theta$ in $R$,

$\int_{-\infty}^{\infty} \left(-\frac{\partial^2 \log f}{\partial\theta^2}\right) f\, dx = I(\theta)$

is finite and non-zero.

(iv) The range of integration is independent of $\theta$. Under these conditions (which are called the regularity conditions) we have the following theorems without proof.

Theorem 18. With probability approaching unity as $n \to \infty$, the likelihood equation

$\frac{\partial \log L}{\partial\theta} = 0$

has a solution which converges in probability to the true value $\theta_0$.

Theorem 19. (Due to Huzurbazar, 1948). Any consistent solution of the likelihood equation provides a maximum of the likelihood with probability tending to unity as the sample size tends to infinity.

Deduction. Under regularity conditions, as $n$ increases, there is a unique consistent ML estimator. Equivalently, a consistent solution of the likelihood equation is unique.

Theorem 20. (Due to Cramér). Assume that the first two derivatives of the likelihood function with respect to $\theta$ exist in an interval of $\theta$ including the true value $\theta_0$, and assume further that

$$E\Big(\frac{\partial \log L(x \mid \theta)}{\partial\theta}\Big)=0, \tag{7}$$

and that

$$R^2(\theta)=-E\Big(\frac{\partial^2 \log L(x \mid \theta)}{\partial\theta^2}\Big)$$

exists and is non-zero for each $\theta$ in the interval. Then the ML estimator $\hat\theta$ is asymptotically normally distributed with mean $\theta_0$ and variance equal to $1/R^2(\theta_0)$.

Remarks.

R1. Under the regularity conditions the ML estimator is efficient. Reason: The preceding theorem gives the ML estimator an asymptotic variance equal to the minimum variance bound

$$\operatorname{var}(\hat\theta)\ \ge\ \frac{1}{E\big[(\partial \log L/\partial\theta)^2\big]}=-\frac{1}{E\big[\partial^2 \log L/\partial\theta^2\big]}.$$

R2. The ML estimator is asymptotically sufficient. Argument: The MVB can be attained only in the presence of a sufficient statistic.

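R3. The theorem due to Cramér simplifies for a distribution admitting a single sufficient statistic for the parameter. The MVB becomes simply

$$\operatorname{var}\hat\theta=-1\Big/\Big(\frac{\partial^2 \log L}{\partial\theta^2}\Big)_{\theta=\hat\theta},$$

and is attained exactly when $\hat\theta$ is unbiased for $\theta$, and asymptotically under the conditions stated in the theorem. There is no need to evaluate the expectation in this case.

Example. The ML estimator of the standard deviation $\sigma$ of a normal distribution

$$f(x \mid \sigma)=\frac{1}{\sigma\sqrt{2\pi}}\exp\Big(-\frac{x^2}{2\sigma^2}\Big),\qquad -\infty<x<\infty,$$

has, asymptotically, the variance $\sigma^2/2n$.

Evidently,

$$\log L(x \mid \sigma)=-n\log\sigma-\frac{\sum_j x_j^2}{2\sigma^2},$$

dropping the constant term. Then

$$\frac{\partial \log L(x \mid \sigma)}{\partial\sigma}=-\frac{n}{\sigma}+\frac{\sum_j x_j^2}{\sigma^3},$$

which gives the desired ML estimator

$$\hat\sigma=\sqrt{\frac{1}{n}\sum_j x_j^2}$$

for the parameter $\sigma$. Moreover,

$$\frac{\partial^2 \log L(x \mid \sigma)}{\partial\sigma^2}=\frac{n}{\sigma^2}-\frac{3\sum_j x_j^2}{\sigma^4}=\frac{n}{\sigma^2}\Big(1-\frac{3\hat\sigma^2}{\sigma^2}\Big),$$

which equals $-2n/\hat\sigma^2$ at $\sigma=\hat\sigma$. An application of R3 and Theorem 20 leads, as $n$ increases, to

$$\operatorname{var}\hat\sigma=\frac{\hat\sigma^2}{2n}.$$

This completes the proof of the problem.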

The use of the Likelihood Function

The LF is enshrined as the repository of all the 'information' in the sample; that is, the LF contains all the information in the sample precisely in the sense that the observations themselves constitute a set of jointly sufficient statistics for the parameters of any problem under consideration. That is to say, the functional form of the distributions generating the observations must be decided before the LF can be used at all, whether for ML estimation or otherwise. In other words, the statistician must supply some




y-arf ir - £ 

“S.T tackle .Item »«' <* r °“" d “ “ 

17. Method of Moments (due to Karl Pearson)

Denote the density function of the parent population with $j$ parameters $\theta_i$ $(i=1, 2, \ldots, j)$ by $f(x \mid \theta_1, \theta_2, \ldots, \theta_j)$.

If $\mu_r'$ is the $r$th moment about the origin, then we have

$$\mu_r'=\int x^r f(x \mid \theta_1, \theta_2, \ldots, \theta_j)\,dx,\qquad (r=1, 2, \ldots, j) \tag{1}$$

i.e.

$$\mu_1'=\int x\, f(x \mid \theta_1, \ldots, \theta_j)\,dx,\quad \mu_2'=\int x^2 f(x \mid \theta_1, \ldots, \theta_j)\,dx,\quad \ldots,\quad \mu_j'=\int x^j f(x \mid \theta_1, \ldots, \theta_j)\,dx. \tag{2}$$

Note that $\mu_1', \mu_2', \ldots, \mu_j'$ are functions of the parameters $\theta_1, \theta_2, \ldots, \theta_j$.

Assume $O_n : \{x_1, \ldots, x_n\}$ is a random sample of size $n$ from the given population. The method of moments consists of solving the $j$ equations given by (1) for $\theta_1, \ldots, \theta_j$ in terms of $\mu_1', \ldots, \mu_j'$ and replacing these moments $\mu_r'$ $(r=1, 2, \ldots, j)$ by the sample moments. For example,

$$\theta_i=\theta_i(\mu_1', \mu_2', \ldots, \mu_j'),\qquad \hat\theta_i=\theta_i(m_1', m_2', \ldots, m_j'),\qquad (i=1, 2, \ldots, j)$$

where $m_i'$ is the $i$th moment about the origin in the sample.

It is important to note that the method of maximum likelihood and the method of moments do not, in general, lead to the same estimates. Now we are interested in finding the most general form of the distribution under which they lead to the same estimate, and their relative efficiencies.

Now assume $\theta_i$ $(i=1, 2, 3, 4)$ are the four parameters and the maximum likelihood estimate is obtained as a linear function of the moments. Then

$$\frac{\partial}{\partial\theta}\log L=a_0+a_1\sum x+a_2\sum x^2+a_3\sum x^3+a_4\sum x^4$$

and hence

$$f(x \mid \theta_1, \theta_2, \theta_3, \theta_4)=\exp\{b_0+b_1x+b_2x^2+b_3x^3+b_4x^4\} \tag{3}$$

which is the most general form.

Note that the $b$'s depend upon the $\theta$'s. For the form (3), the method of moments provides maximum likelihood estimators.



Here the $b$'s are computed under the assumption that the total frequency is unity and the distribution function converges.

The reader should keep in mind that in the case of the normal distribution the two sets of estimates have the same variance. The method of moments is most efficient for $b_1=b_3=b_4=0$.

(3) does not give the Pearsonian system of curves for other values of the $b$'s.

Let $b_1$ be zero and $b_3$ and $b_4$ be small compared with $b_2$. Then

$$\frac{1}{f}\frac{df}{dx}=\frac{2b_2x}{1-\dfrac{3b_3}{2b_2}x-\dfrac{2b_4}{b_2}x^2},$$

which gives, on integration, the Pearsonian Types.

Under quite general conditions, the estimates obtained by the method of moments ...
Examples.

Ex. 1. Estimate the parameter $p$ in sampling from the binomial population

$$f(x; n, p)=\binom{n}{x}p^xq^{n-x},\qquad x=0, 1, 2, \ldots, n$$

by the method of moments.

Denote a sample of size $n$ from the binomial population $f(x; n, p)$ by

$$O_n=(x_1, x_2, \ldots, x_n).$$

Then the sample mean is

$$\bar x=(1/n)\sum_{j=1}^{n} x_j.$$

For the binomial population,

Mean $=E(x)=np$.

Therefore the method of moments gives

$$np=\bar x\ \Rightarrow\ \hat p=\frac{\bar x}{n}.$$
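The estimate of Ex. 1 can be sketched in a few lines. This is only an illustration, assuming the binomial index (number of trials) is known; the name `binomial_p_moments` and the argument `n_trials` are hypothetical and are kept separate from the sample size to avoid confusing the two.

```python
from statistics import fmean

def binomial_p_moments(sample, n_trials):
    """Method-of-moments estimate of p: equate n_trials * p to the sample mean."""
    return fmean(sample) / n_trials
```

For instance, the counts 3, 5, 4, 4 observed with 10 trials each have mean 4, so the estimate of p is 0.4.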


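Ex. 2. Estimate $\alpha$ and $\beta$ in the case of Pearson's Type III distribution

$$f(x; \alpha, \beta)=\frac{\beta^\alpha}{\Gamma(\alpha)}x^{\alpha-1}\exp(-\beta x),\qquad 0\le x<\infty.$$

By definition

$$\mu_1'=\int_0^\infty x f(x; \alpha, \beta)\,dx=\frac{\beta^\alpha}{\Gamma(\alpha)}\int_0^\infty x^\alpha\exp(-\beta x)\,dx=\frac{\beta^\alpha}{\Gamma(\alpha)}\cdot\frac{\Gamma(\alpha+1)}{\beta^{\alpha+1}}=\frac{\alpha}{\beta}, \tag{1}$$

since

$$\int_0^\infty \exp(-\beta x)\,x^n\,dx=\frac{\Gamma(n+1)}{\beta^{n+1}},$$

and

$$\mu_2'=\int_0^\infty x^2 f(x; \alpha, \beta)\,dx=\frac{\beta^\alpha}{\Gamma(\alpha)}\int_0^\infty x^{\alpha+1}\exp(-\beta x)\,dx=\frac{\beta^\alpha}{\Gamma(\alpha)}\cdot\frac{\Gamma(\alpha+2)}{\beta^{\alpha+2}}=\frac{\alpha(\alpha+1)}{\beta^2}. \tag{2}$$

Then

$$\frac{\mu_2'}{\mu_1'^2}=\frac{\alpha(\alpha+1)}{\beta^2}\Big/\frac{\alpha^2}{\beta^2}=\frac{\alpha+1}{\alpha}=\frac{1}{\alpha}+1,$$

so that

$$\frac{1}{\alpha}=\frac{\mu_2'}{\mu_1'^2}-1,$$

which gives:

$$\alpha=\frac{\mu_1'^2}{\mu_2'-\mu_1'^2}. \tag{3}$$

From (1), we find

$$\beta=\frac{\alpha}{\mu_1'}=\frac{\mu_1'}{\mu_2'-\mu_1'^2}. \tag{4}$$

Then

$$\hat\alpha=\frac{m_1'^2}{m_2'-m_1'^2}\quad\text{and}\quad \hat\beta=\frac{m_1'}{m_2'-m_1'^2},$$

where $m_1'$ and $m_2'$ denote the sample moments.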
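The two estimates of Ex. 2 depend only on the first two sample moments about the origin, as the sketch below shows. The function name `type3_moments` is hypothetical, and the sketch assumes the sample is not constant (otherwise the denominator vanishes).

```python
def type3_moments(sample):
    """Method-of-moments estimates (alpha, beta) for Pearson's Type III density."""
    n = len(sample)
    m1 = sum(sample) / n                    # first moment about the origin
    m2 = sum(x * x for x in sample) / n     # second moment about the origin
    spread = m2 - m1 * m1                   # equals the sample variance (divisor n)
    return m1 * m1 / spread, m1 / spread
```

For the sample 1, 2, 3 the moments are 2 and 14/3, so the estimates are 6 for alpha and 3 for beta.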

Ex. 3. The double Poisson distribution is defined by

$$p(x)=P\{X=x\}=\frac12\,\frac{e^{-m_1}m_1^x}{x!}+\frac12\,\frac{e^{-m_2}m_2^x}{x!},\qquad x=0, 1, 2, \ldots$$

Then, by the method of moments, the estimates for $m_1$ and $m_2$ are:

$$m_1'\pm\sqrt{m_2'-m_1'-m_1'^2}.$$

By definition,

$$\mu_1'\ \text{(about the origin)}=\sum_{x=0}^{\infty}x\,p(x)=\frac12\Big[\sum_{x=0}^{\infty}x\,\frac{e^{-m_1}m_1^x}{x!}+\sum_{x=0}^{\infty}x\,\frac{e^{-m_2}m_2^x}{x!}\Big]=\frac12\,[m_1+m_2],$$

because the first and second summations are, in fact, the means of Poisson distributions with parameters $m_1$ and $m_2$ respectively. Then

$$\mu_1'=\tfrac12(m_1+m_2). \tag{1}$$

Now

$$\mu_2'\ \text{(about the origin)}=\sum_{x=0}^{\infty}x^2p(x)=\frac12\Big[\sum_{x=0}^{\infty}\{x(x-1)+x\}\frac{e^{-m_1}m_1^x}{x!}+\sum_{x=0}^{\infty}\{x(x-1)+x\}\frac{e^{-m_2}m_2^x}{x!}\Big]$$

$$=\frac12\Big[e^{-m_1}m_1^2\sum_{x=2}^{\infty}\frac{m_1^{x-2}}{(x-2)!}+m_1+e^{-m_2}m_2^2\sum_{x=2}^{\infty}\frac{m_2^{x-2}}{(x-2)!}+m_2\Big].$$

Note that

$$\sum_{x=0}^{\infty}\frac{m^x}{x!}=e^m.$$

Then

$$\mu_2'=\tfrac12\,[(m_1+m_2)+m_1^2+m_2^2]. \tag{2}$$

By (1), $m_1+m_2=2\mu_1'$, so that $m_2=2\mu_1'-m_1$. Then (2) yields:

$$\mu_2'=\tfrac12\,[2\mu_1'+m_1^2+(2\mu_1'-m_1)^2]=\tfrac12\,[2\mu_1'+m_1^2+4\mu_1'^2+m_1^2-4\mu_1'm_1],$$

which gives:

$$\mu_2'=\mu_1'+m_1^2+2\mu_1'^2-2\mu_1'm_1$$

or

$$m_1^2-2m_1\mu_1'+(2\mu_1'^2+\mu_1'-\mu_2')=0.$$

Then

$$m_1=\frac{2\mu_1'\pm\sqrt{4\mu_1'^2-4(2\mu_1'^2+\mu_1'-\mu_2')}}{2}=\mu_1'\pm\{\mu_2'-\mu_1'-\mu_1'^2\}^{1/2}.$$

The substitution $m_1=2\mu_1'-m_2$ in (2) provides:

$$m_2^2-2m_2\mu_1'+(2\mu_1'^2+\mu_1'-\mu_2')=0,$$

from which we find

$$m_2=\mu_1'\pm\{\mu_2'-\mu_1'-\mu_1'^2\}^{1/2}.$$

Replacing $\mu_1'$ and $\mu_2'$ by the sample moments $m_1'$ and $m_2'$, the result follows immediately.
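The two roots of Ex. 3 can be computed directly from the sample moments. The sketch below is illustrative only; the function name `double_poisson_moments` is hypothetical, and the explicit error for a negative discriminant is an added safeguard, since the moment equations then have no real solution.

```python
import math

def double_poisson_moments(sample):
    """Method-of-moments estimates (smaller, larger) of the two Poisson parameters."""
    n = len(sample)
    m1 = sum(sample) / n                    # first sample moment about the origin
    m2 = sum(x * x for x in sample) / n     # second sample moment about the origin
    disc = m2 - m1 - m1 * m1
    if disc < 0:
        raise ValueError("moment equations have no real solution for this sample")
    root = math.sqrt(disc)
    return m1 - root, m1 + root
```

For the sample 1, 1, 5, 5 the moments are 3 and 13, the discriminant is 1, and the estimates are 2 and 4.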

Exercises 

1. Explain the term consistency. Show that the mean of a random sample is a consistent estimator of the mean of a normal population.

2. Establish that (a) the sample variance in random sampling from a normal population is a consistent estimator, (b) the sample median is a consistent estimator for the population mean of $N(\mu, \sigma^2)$ and (c) in the Cauchy distribution

$$dF(x)=\frac{1}{\pi}\,\frac{dx}{1+(x-a)^2}$$

the sample mean is not a consistent estimator but the sample median is.

3. What is Fisher's criterion for the best estimator ?

4. Explain the term unbiasedness, giving suitable examples. Show that the mean statistic is an unbiased estimator of the parent mean provided that the latter exists, but the sample variance is not an unbiased estimator of the parent variance.

5. Assume $x_1, \ldots, x_n$ is a random sample from a normal population $N(\mu, 1)$. Then $t=\frac{1}{n}\sum_{i=1}^{n}x_i^2$ is an unbiased estimator of $\mu^2+1$.

6. Let $x_1, \ldots, x_n$ be a random sample of size $n$ from a normal distribution with mean 0 and variance $\sigma^2$; then the statistic $t=\frac{1}{n}\sum_{i=1}^{n}x_i^2$ is an unbiased estimator for $\sigma^2$ and has variance $2\sigma^4/n$.

7. The mean $\bar x$ of a sample of size $n$ from the distribution

$$f(x, \theta)=\frac{1}{\theta}\exp(-x/\theta),\quad 0<x<\infty;\qquad =0\ \text{elsewhere},$$

is an unbiased estimator for $\theta$ and has the variance $\theta^2/n$.

8. If $\hat\theta(x_1, \ldots, x_n)$ is an unbiased estimator for $\theta$,

(i) Is $\hat\theta^2$ an unbiased estimator of $\theta^2$ ?

(ii) Is $\sqrt{\hat\theta}$ a biased estimator of $\sqrt{\theta}$ ?

(iii) Is $1/\hat\theta$ an unbiased estimator of $1/\theta$ ?

9. (a) Define the desirable properties we look for in estimators. Give one example for each.

(b) Show by an example that a most efficient estimator is not necessarily unbiased.

(c) Show that for samples of size $n$ drawn from a normal population with mean $\mu$ and variance $\sigma^2$, the statistic $x^*=\frac{1}{n+1}\sum_{i=1}^{n}x_i$ is most efficient for estimating $\mu$, though it is biased.

(d) Give an example to show that the sample mean is not necessarily always the most efficient estimator of the population mean.

10. (a) Define a consistent estimator of a parameter $\theta$. Give an example to show that a consistent estimate need not be unbiased. (b) For the Poisson parameter $\theta$ show that $1/\bar x$ is a consistent estimator of $1/\theta$.

11. Give an example of estimators which are

(i) Unbiased and efficient, (ii) Unbiased and inefficient,

(iii) Biased and inefficient.

12. Explain the term likelihood function of a sample of $n$ independent observations from a continuous as well as a discrete population. Then establish the Cramér-Rao inequality.

13. Show that (a) the mean of a random sample of size $n$ from $N(\theta, \sigma^2)$, $\sigma$ known, is the MVB estimator of $\theta$ with variance $\sigma^2/n$, (b) there exists no MVB estimator of $\theta$ in

$$dF(x)=\frac{1}{\pi}\,\frac{dx}{1+(x-\theta)^2},\qquad -\infty\le x\le\infty,$$

(c) the mean of a random sample from the Poisson distribution

$$f(x \mid \theta)=e^{-\theta}\theta^x/x!,\qquad x=0, 1, 2, \ldots, \infty$$

is the MVB estimator of $\theta$ with variance $\theta/n$,

(d) $r/n$ is the MVB estimator of $\theta$ with variance $\theta(1-\theta)/n$ in the binomial distribution, for which

$$L(r \mid \theta)=\binom{n}{r}\theta^r(1-\theta)^{n-r},\qquad r=0, 1, 2, \ldots, n.$$





14. If a minimum variance estimator exists, it is always 
unique, irrespective of whether any bound is attained. 

15. What is the concept of efficiency ? Examine the efficiency 
of the median as an estimate of the location parameter of (i) the 
normal distribution and (ii) Cauchy distribution. 

16. If $t_1$ is the most efficient estimator of $\theta$ and $t_2$ another estimate of efficiency $e$, show that

$$\operatorname{Var}(t_1-t_2)=\Big(\frac{1}{e}-1\Big)\operatorname{Var}(t_1).$$

17. Let $x_1, x_2, \ldots, x_n$ denote a random sample from the population which has p.d.f.

$$f(x, \theta)=\theta^x(1-\theta)^{1-x},\quad x=0, 1;\ 0<\theta<1;\qquad =0\ \text{elsewhere}.$$

Then the statistic $\bar x=\frac{1}{n}\sum x_i$ is a sufficient statistic for $\theta$.

18. Let $x_1, x_2, \ldots, x_n$ denote a random sample from a distribution with p.d.f.

$$f(x, \theta)=\theta x^{\theta-1},\quad 0<x<1,\ \theta>0;\qquad =0\ \text{elsewhere}.$$

Apply the factorization theorem to show that the product $\prod_{i=1}^{n}x_i$ is a sufficient statistic for $\theta$.

19. If $\sigma^2$ is known in random sampling from normal populations, $\bar x$ is a sufficient estimator for $\mu$; but if $\mu$ is known, $S^2=\frac{1}{n}\sum(x_i-\bar x)^2$ is not a sufficient estimator for $\sigma^2$.

Is $S^2=\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2$ sufficient for $\sigma^2$ ?

20. A sufficient estimator is most efficient if a most efficient estimator exists.

21. What is the maximum likelihood principle ? Show that the likelihood equation $\frac{\partial}{\partial\theta}\log L=0$ always has a unique solution and it is the maximum of the L. F.

For the binomial distribution with density function

$$f(x \mid p)=p^xq^{1-x},\qquad x=0, 1;\ q=1-p,$$

the maximum likelihood estimator for $p$ is the sample mean, and it is also a sufficient estimator.






22. A sufficient estimator is most efficient provided that a 
most efficient estimator exists. 

23. Apply the method of maximum likelihood to find (a) an unbiased estimator of $\mu$ when $\sigma$ is known, (b) an unbiased estimator of $\sigma$ when $\mu$ is known and (c) the simultaneous maximum likelihood estimates of $\mu$ and $\sigma$.

24. For random sampling from a normal population, find the maximum likelihood estimator for (i) the population mean when the population variance is known, (ii) the population variance when the population mean is known, and (iii) the simultaneous estimate of both the population mean and variance.

25. State the properties of maximum likelihood estimators.

26. Apply the method of moments to estimate $p$ in sampling from the binomial population:

$$f(x, p)=\binom{n}{x}p^xq^{n-x},\qquad x=0, 1, 2, \ldots, n.$$



16 

CONFIDENCE INTERVAL 


16.1. Introduction.

The problem of estimation is to estimate the parameter $\theta$ by a function which, for a specified sample, gives a unique estimate. We shall now specify, instead, a range in which $\theta$ lies. Three methods, of which two are similar but not identical, are as follows:

(1) The method of Confidence Intervals relies only on the frequency theory of probability without importing any new principle of inference.

(2) The method of Fiducial Intervals explicitly requires something beyond a frequency theory.

(3) The third method depends on Bayes' theorem and some form of Bayes' postulate.

16.2. The Neyman-Pearson Theory of Confidence Intervals.

Assume $O_n : (x_1, x_2, \ldots, x_n)$ is a random sample of size $n$ from a given population with density $f(x; \theta)$, and further assume that $\hat\theta(x_1, \ldots, x_n)$ or $\hat\theta(O_n)$ is an estimator for $\theta$, and $g(\hat\theta; \theta)$ the frequency function of $\hat\theta=\hat\theta(x_1, \ldots, x_n)$. We denote further by $P(S; \theta)$ the joint probability function of the sample variables $x_1, x_2, \ldots, x_n$. Let $\theta'$ be an arbitrary number. This, when substituted for $\theta$ in $g(\hat\theta; \theta)$, generates the completely specified distribution of $\hat\theta$. Then we are in a position to make probability statements about $\hat\theta$.

For each fixed $\theta$, the frequency function $g(\hat\theta; \theta)$ defines the probability distribution of $\hat\theta$, which may be interpreted as a distribution of a unit of mass on the vertical through the point $(\theta, 0)$ in the $(\theta, \hat\theta)$ plane. For any value of $\theta$, we can determine two quantities $h_1=h_1(\theta, \epsilon)$ and $h_2=h_2(\theta, \epsilon)$ such that

$$P\{\hat\theta<h_1\}=\int_{-\infty}^{h_1}g(\hat\theta; \theta)\,d\hat\theta=\epsilon_1\quad\text{and}\quad P\{\hat\theta>h_2\}=\int_{h_2}^{\infty}g(\hat\theta; \theta)\,d\hat\theta=\epsilon_2. \tag{1}$$

Then we have

$$1=\int_{-\infty}^{\infty}g(\hat\theta; \theta)\,d\hat\theta=\int_{-\infty}^{h_1}g(\hat\theta; \theta)\,d\hat\theta+\int_{h_1}^{h_2}g(\hat\theta; \theta)\,d\hat\theta+\int_{h_2}^{\infty}g(\hat\theta; \theta)\,d\hat\theta,$$

from which we obtain

$$P\{h_1(\theta, \epsilon)<\hat\theta<h_2(\theta, \epsilon) \mid \theta\}=\int_{h_1}^{h_2}g(\hat\theta; \theta)\,d\hat\theta=1-\epsilon, \tag{2}$$

where $\epsilon_1+\epsilon_2=\epsilon$.

As $\theta$ varies, the points $(\theta, h_1)$ and $(\theta, h_2)$ describe two curves in the plane of $(\theta, \hat\theta)$ and in most cases each of these two curves will be intersected by a straight line parallel to the axis of $\theta$ in a single point only.



The functions $h_1(\theta, \epsilon)$ and $h_2(\theta, \epsilon)$ may be plotted against $\theta$ as in the above figure. A vertical line through any chosen value $\theta'$ of $\theta$ will intersect the two curves in points which, projected on the $\hat\theta$ axis, will provide limits between which $\hat\theta$ will fall with probability $1-\epsilon$.

After constructing the two curves

$$\hat\theta=h_1(\theta, \epsilon)\quad\text{and}\quad \hat\theta=h_2(\theta, \epsilon),$$

we can construct a confidence interval for $\theta$ as follows:

Draw a sample of size $n$ and compute the value of the estimator, say $\hat\theta'$. A horizontal line through the point $\hat\theta'$ on the $\hat\theta$ axis will intersect the two curves at points which may be projected on the $\theta$-axis. Label these points as $d_1(\hat\theta, \epsilon)$ and $d_2(\hat\theta, \epsilon)$. Denote the domain situated between the curves $h_1(\theta, \epsilon)$ and $h_2(\theta, \epsilon)$ by $D(\epsilon)$. Then we have the following three relations in hand:







$$(\theta, \hat\theta)\in D(\epsilon),\qquad h_1(\theta, \epsilon)<\hat\theta<h_2(\theta, \epsilon),\qquad d_1(\hat\theta, \epsilon)<\theta<d_2(\hat\theta, \epsilon). \tag{3}$$

Note that for any fixed value of $\theta$, each of these relations holds for a set of points $x=(x_1, \ldots, x_n)$ in the sample space. All three relations express the fact that the point $(\theta, \hat\theta)$ belongs to the domain $D(\epsilon)$. This implies that the relations are perfectly equivalent. In conclusion the sets in (3) are identical, and from (2) we find for each value of $\theta$

$$P\{d_1(\hat\theta, \epsilon)<\theta<d_2(\hat\theta, \epsilon) \mid \theta\}=1-\epsilon, \tag{4}$$

where $d_1$ and $d_2$ are functions of $\hat\theta$ and therefore of the sample, because $\hat\theta$ is a statistic computed from the sample observations.

Thus we have the following

Definition. The random interval

$$\delta(O_n) : d_1<\theta<d_2$$

is called a confidence interval for the parameter $\theta$ with regard to $1-\epsilon$ (which we call the confidence coefficient) or $\epsilon$ (which is defined as the confidence level). The quantities $d_1$ and $d_2$ are the corresponding confidence limits.

A reformulation of the above definition of confidence interval for $\theta$ is as follows:

Assume that we have a distribution $f(x, \theta)$ depending on one parameter $\theta$, and two functions $\underline\theta(O_n)$ and $\bar\theta(O_n)$ which depend on the sample $O_n : (x_1, x_2, \ldots, x_n)$ but not on $\theta$, such that the interval

$$\delta(O_n) : \underline\theta(O_n)<\theta<\bar\theta(O_n)$$

is a random interval. Then the probability that the random interval $\delta$ between $\underline\theta(O_n)$ and $\bar\theta(O_n)$ includes, or covers, the true value of the parameter is $1-\epsilon$, i.e.

$$P\{\delta(O_n) : \underline\theta(O_n)<\theta<\bar\theta(O_n) \mid \theta\}=1-\epsilon,$$

whatever the true value of $\theta$. We call $\delta(O_n)$ a confidence interval, and $\underline\theta(O_n)$ and $\bar\theta(O_n)$ are termed the lower and upper confidence limits. They depend only on $1-\epsilon$ and the sample values. For any fixed $1-\epsilon$, the totality of confidence intervals for different samples determines a field within which $\theta$ is asserted to lie. This field is called the confidence belt.

Example. We wish to clarify these ideas by considering the following illustration.

A sample $O_4 : (1.2, 3.4, 0.6, 5.6)$ of four observations is drawn from a normal population with unknown mean $\mu$ and known standard deviation 3.

Evidently

$$\bar x=\tfrac14(1.2+3.4+0.6+5.6)=2.7. \tag{1}$$

The ML estimator of $\mu$ is the mean of the sample observations. Now $V(\bar x)=\sigma^2/n$. Then

$$y=(\bar x-\mu)/(\sigma/\sqrt n)=(\bar x-\mu)/(3/2) \tag{2}$$

is a standardized normal variate with zero mean and unit variance. We can compute the probability that $y$ will be between any two arbitrarily chosen numbers:

$$P\{-d_\epsilon<y<d_\epsilon\}=\int_{-d_\epsilon}^{d_\epsilon}f(y)\,dy,$$

where $f(y)=\frac{1}{\sqrt{2\pi}}\exp(-\tfrac12y^2)$. Then for the preassigned number 0.95, we have

$$P\{-1.96<y<1.96\}=\int_{-1.96}^{1.96}f(y)\,dy=0.95,$$

i.e.

$$P\Big\{-1.96<\frac{\bar x-\mu}{3/2}<1.96\Big\}=0.95, \tag{3}$$

i.e.

$$P\{\mu-2.94\ (=1.96\times\tfrac32)<\bar x<\mu+2.94\}=0.95,$$

which, on inverting the inequality, assumes the form

$$P\{\bar x-2.94<\mu<\bar x+2.94\}=0.95. \tag{4}$$

With $\bar x=2.7$, we have

$$P\{-0.24<\mu<5.64\}=0.95. \tag{5}$$

Thus we have obtained two limits: $-0.24$ and $5.64$.

This suggests that these limits contain the true parameter value between them with 95% certainty.
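The arithmetic of this illustration is easily reproduced. The sketch below assumes a known standard deviation and uses the normal quantile from Python's standard library in place of the tabulated value 1.96; the function name `mean_interval_known_sigma` is hypothetical.

```python
import math
from statistics import NormalDist, fmean

def mean_interval_known_sigma(sample, sigma, coefficient=0.95):
    """Central confidence limits for a normal mean when sigma is known."""
    d = NormalDist().inv_cdf(0.5 + coefficient / 2)   # about 1.96 for 0.95
    half_width = d * sigma / math.sqrt(len(sample))
    centre = fmean(sample)
    return centre - half_width, centre + half_width
```

Applied to the sample (1.2, 3.4, 0.6, 5.6) with sigma equal to 3, the limits agree with -0.24 and 5.64 to two decimal places.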


Now we wish to examine the meaning of (5) very carefully. (5) appears to treat $\mu$ as the variable and to assert that the probability that this variable lies between $-0.24$ and $5.64$ is 0.95. This version is incorrect.

Reason. $\mu$ is a fixed number, the mean of the population sampled. Moreover, the true mean $\mu$ either does or does not lie between $-0.24$ and $5.64$. The only correct probability statement in this event is

$$P\{-0.24<\mu<5.64\}=1$$

provided that $\mu$ is actually between the numbers $-0.24$ and $5.64$, or

$$P\{-0.24<\mu<5.64\}=0$$

provided that $\mu$ does not lie between $-0.24$ and $5.64$. However, (5) can still be given a meaningful interpretation.

The statement in (4) does have meaning: the probability that a random interval, $\bar x-2.94$ to $\bar x+2.94$, covers the true mean is 0.95. This implies that if samples of four were repeatedly drawn from the population, and if the random interval $\bar x-2.94$ to $\bar x+2.94$ were computed for each sample, then 95% of those intervals would be expected to contain the true mean $\mu$. This justifies considerable confidence that the interval $-0.24$ to $5.64$ does cover the true mean.

Conclusion. The measure of confidence is 0.95 because before the sample was drawn, 0.95 was the probability that the interval we were going to construct would cover the true mean. That is to say, in (4) the number 0.95 is a true probability, but the reader must have in mind that in (5) 0.95 is not a true probability in spite of the fact that it is a measure of confidence in the truth of the statement on the left hand side of (5). 0.95 is called the confidence coefficient or the fiducial probability, which is distinguished from our ordinary concept of probability. Then (5) assumes the form

$$P_F\{-0.24<\mu<5.64\}=0.95 \tag{6}$$

which reads: "The fiducial probability that the interval $-0.24$ to $5.64$ covers the true mean is 0.95". The word fiducial reveals nothing more than that the probability associated with the given interval was 0.95 before the sample was drawn.

The interval $-0.24$ to $5.64$ is termed a confidence interval, more specifically a 95% confidence interval, where the confidence coefficient or fiducial probability is expressed as a percentage. We are free to construct intervals with any desired degree of confidence.
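The repeated-sampling interpretation can be illustrated by simulation. The sketch below is an assumption-laden illustration and not part of the original argument: it draws many samples of four from a normal population with a chosen mean and standard deviation 3, and records how often the interval from the sample mean minus 2.94 to the sample mean plus 2.94 covers the true mean. The function name `coverage_known_sigma`, the number of trials and the seed are hypothetical choices.

```python
import math
import random

def coverage_known_sigma(mu, sigma=3.0, n=4, trials=20000, seed=0):
    """Fraction of simulated samples whose 95% interval covers the true mean mu."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / math.sqrt(n)     # 2.94 when sigma=3, n=4
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials
```

The fraction returned should be close to 0.95 whatever value of the true mean is supplied, which is the sense in which 0.95 is a true probability in (4).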

16.3. Confidence Interval for unknown mean $\mu$, when $\sigma$ is known, of a Normal Distribution

Assume that $O_n : (x_1, x_2, \ldots, x_n)$ is a random sample of $n$ independent observations from a normal population $N(\mu, \sigma^2)$. This implies that each $x_j$ $(j=1, 2, \ldots, n)\sim N(\mu, \sigma^2)$ so that $\bar x$ (the sample mean) $\sim N(\mu, \sigma^2/n)$. Then $(\bar x-\mu)/(\sigma/\sqrt n)\sim N(0, 1)$.

Therefore for the confidence coefficient $1-\epsilon$, we have

$$P\{-d_\epsilon<(\bar x-\mu)/(\sigma/\sqrt n)<d_\epsilon\}=1-\epsilon,$$

where

$$\int_{-d_\epsilon}^{d_\epsilon}\frac{1}{\sqrt{2\pi}}e^{-t^2/2}\,dt=1-\epsilon.$$

That is

$$P\Big\{\mu-d_\epsilon\frac{\sigma}{\sqrt n}<\bar x<\mu+d_\epsilon\frac{\sigma}{\sqrt n}\Big\}=1-\epsilon,$$

which, on converting the inequalities, leads to

$$P\Big\{\bar x-d_\epsilon\frac{\sigma}{\sqrt n}<\mu<\bar x+d_\epsilon\frac{\sigma}{\sqrt n}\Big\}=1-\epsilon, \tag{1}$$

which provides the desired confidence interval for $\mu$ when $\sigma$ is known. Thus the requested 95% confidence interval for the mean $\mu$ of a normal population is given by

$$\bar x-1.96\frac{\sigma}{\sqrt n}<\mu<\bar x+1.96\frac{\sigma}{\sqrt n}. \tag{2}$$

The graphical representation of the confidence interval given by (2) is depicted in the following figure.

The total zone (shaded) between the lines is the confidence belt.

16.4. Confidence interval for unknown $\mu$, when $\sigma$ is unknown, of a Normal distribution.

We have

$$O_n : (x_1, \ldots, x_n),\qquad x_j\sim N(\mu, \sigma^2),\ \text{where } \sigma \text{ is unknown}.$$

The sample mean $\bar x=\sum x_j/n$ $(j=1, 2, \ldots, n)$ and the sample variance

$$s^2=\frac{1}{n}\sum_{j=1}^{n}(x_j-\bar x)^2$$

would be computed from the sample values. But $\sigma$ is not known. Therefore the limits for $\mu$ cannot be computed as before. In fact, substitution of an estimator $\hat\sigma$ instead of $\sigma$ is possible. Then the probability statement would no longer be exact, and might be wrong for small samples. Recall the statistic $t$ defined by

$$t=\frac{\bar x-\mu}{\sqrt{\sum(x_j-\bar x)^2/\{n(n-1)\}}}=(\bar x-\mu)/(s/\sqrt{n-1}).$$

This quantity involves only the parameter $\mu$ and has the $t$ distribution with $n-1$ degrees of freedom, which does not involve any unknown parameters. Hence we have a possibility to find a number, say $t_p$, such that

$$P\{-t_p<t<t_p\}=\int_{-t_p}^{t_p}f(t; n-1)\,dt=1-\epsilon,$$

where $p=\epsilon/100$ ($\epsilon$ here being expressed as a percentage).

This provides:

$$P\Big\{-t_p<\frac{\bar x-\mu}{s/\sqrt{n-1}}<t_p\Big\}=1-\epsilon,$$

i.e.

$$P\Big\{\mu-t_p\frac{s}{\sqrt{n-1}}<\bar x<\mu+t_p\frac{s}{\sqrt{n-1}}\Big\}=1-\epsilon,$$

from which, on converting the inequalities, we find

$$P\Big\{\bar x-t_p\frac{s}{\sqrt{n-1}}<\mu<\bar x+t_p\frac{s}{\sqrt{n-1}}\Big\}=1-\epsilon. \tag{3}$$

Therefore the 95% confidence interval for the mean $\mu$ of a normal population when $\sigma$ is not known is given by

$$\bar x-t_{.05}\frac{s}{\sqrt{n-1}}<\mu<\bar x+t_{.05}\frac{s}{\sqrt{n-1}}. \tag{4}$$

Note that the number $t_{.05}$ is called the 5% level of $t$ and locates points which cut off 2.5% of the area under $f(t)$ on each tail. Symmetry of $f(t)$ about $t=0$ implies that (4) gives the shortest 95% confidence interval.
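The interval for the mean with unknown standard deviation can be sketched as follows. This is a minimal sketch, assuming the 5% level of t is looked up in a table and passed in by the caller (Python's standard library has no t quantile); the function name `mean_interval_unknown_sigma` is hypothetical, and the sample variance uses the divisor n, so the standard error is the sample standard deviation divided by the square root of n-1.

```python
import math
from statistics import fmean

def mean_interval_unknown_sigma(sample, t_value):
    """Limits xbar -/+ t_value * s / sqrt(n - 1), with s**2 using divisor n."""
    n = len(sample)
    xbar = fmean(sample)
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / n)
    half_width = t_value * s / math.sqrt(n - 1)
    return xbar - half_width, xbar + half_width

# Tabulated two-sided 5% level of t for 3 degrees of freedom is about 3.182.
limits = mean_interval_unknown_sigma([1.2, 3.4, 0.6, 5.6], 3.182)
```

For the four observations of the earlier illustration the limits come out near -0.92 and 6.32, a wider interval than the one obtained when the standard deviation was taken as known.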

16.5. Confidence Intervals for the Variance of a Normal Distribution

Assume that $O_n : (x_1, x_2, \ldots, x_n)$ is a random sample from a normal population with mean $\mu$ and variance $\sigma^2$. The statistic

$$\frac{nS^2}{\sigma^2}=\sum_{j=1}^{n}\frac{(x_j-\bar x)^2}{\sigma^2},$$

where $\bar x$ is the sample mean, is distributed in the chi-squared form with $n-1$ degrees of freedom. For an assigned confidence coefficient $1-\epsilon$, we can set up a confidence interval by determining two numbers $\chi_1^2$, $\chi_2^2$ (say as a central interval) such that

$$P\{\chi_1^2<\chi^2<\chi_2^2\}=1-\epsilon,$$

i.e.

$$P\Big\{\chi_1^2<\frac{nS^2}{\sigma^2}<\chi_2^2\Big\}=1-\epsilon,$$

which, on converting the inequalities, leads to

$$P\Big\{\frac{nS^2}{\chi_2^2}<\sigma^2<\frac{nS^2}{\chi_1^2}\Big\}=1-\epsilon. \tag{5}$$

Therefore the 95% confidence interval for $\sigma^2$ of a normal distribution is given by

$$P\Big\{\frac{nS^2}{\chi_2^2}<\sigma^2<\frac{nS^2}{\chi_1^2}\Big\}=0.95, \tag{6}$$

where $\chi_1^2$, $\chi_2^2$ are numbers such that $\chi_1^2<\chi^2$ with probability 0.975 and $\chi_2^2<\chi^2$ with probability 0.025. That is to say, the confidence interval is

$$\delta(O_n) : \frac{nS^2}{\chi_2^2}<\sigma^2<\frac{nS^2}{\chi_1^2}, \tag{7}$$

where $\chi_1^2=\chi^2_{0.975}$, $\chi_2^2=\chi^2_{0.025}$.

Remark. The length of the confidence interval is $\Big(\dfrac{1}{\chi_1^2}-\dfrac{1}{\chi_2^2}\Big)nS^2$. The shortest confidence interval would be obtained by selecting $\chi_1^2$, $\chi_2^2$ so as to minimize $\dfrac{1}{\chi_1^2}-\dfrac{1}{\chi_2^2}$ for the preassigned confidence coefficient $1-\epsilon$.

16.6. Confidence Region for Mean and Variance of a Normal Distribution.

In constructing a region for the joint estimation of the mean $\mu_0$ and variance $\sigma_0^2$ of a normal distribution, we recall that

$$P\Big\{\bar x-t_{.05}\frac{s}{\sqrt{n-1}}<\mu<\bar x+t_{.05}\frac{s}{\sqrt{n-1}}\Big\}=0.95, \tag{8}$$

which provides

$$\delta(O_n) : \bar x-t_{.05}\frac{s}{\sqrt{n-1}}<\mu_0<\bar x+t_{.05}\frac{s}{\sqrt{n-1}},$$

and

$$P\Big\{\frac{nS^2}{\chi^2_{0.025}}<\sigma^2<\frac{nS^2}{\chi^2_{0.975}}\Big\}=0.95, \tag{9}$$

which provides

$$\delta(O_n) : \frac{nS^2}{\chi^2_{0.025}}<\sigma_0^2<\frac{nS^2}{\chi^2_{0.975}}.$$

By using the relations (8) and (9), we could construct a 0.9025 $(=0.95^2)$ region under the assumption that the probability of both occurrences is the product of their separate probabilities. This does not hold. The simple reason is: $t$ and $\chi^2$ are not independently distributed, that is, the joint probability that the two intervals cover the true parameter values is not equal to the product of the separate probabilities. Therefore the probability that the rectangular region of the following figure covers the true parameter point $(\mu_0, \sigma_0^2)$ is not 0.9025.


[Figure: the rectangular region given by (8) and (9), and the shaded confidence region, in the $(\mu, \sigma^2)$ plane.]

The reader must keep in mind the fact that $\bar x$ and $\sum(x_j-\bar x)^2$ are independently distributed. With this fact, a confidence region may be constructed. For instance, we wish to construct a 95% confidence region. Then we determine numbers $a$, $b_1$ and $b_2$ such that

$$P\Big\{-a<\frac{\bar x-\mu}{\sigma/\sqrt n}<a\Big\}=\sqrt{0.95}, \tag{10}$$

$$P\Big\{b_1<\frac{\sum(x_j-\bar x)^2}{\sigma^2}<b_2\Big\}=\sqrt{0.95}. \tag{11}$$

The joint probability is then

$$P\Big\{-a<\frac{\bar x-\mu}{\sigma/\sqrt n}<a,\ \ b_1<\frac{\sum(x_j-\bar x)^2}{\sigma^2}<b_2\Big\}=0.95, \tag{12}$$

because of the independence of the distributions. We have in hand the four inequalities in (10) and (11), which determine a region in the parameter space; this region will be obtained by plotting its boundaries. We may replace the inequality signs by equality signs, and plot each of the four resulting relations as functions of $\mu$ and $\sigma^2$ in the parameter space. Then we will have a region as shaded in the above figure.

Remark. A similar procedure will produce a confidence region for $(\mu_0, \sigma_0)$. The relations would be plotted as functions of $\sigma$ instead of $\sigma^2$, and then, in place of a parabola, we have a pair of straight lines (as exhibited in the following figure)

$$\mu=\bar x\pm\frac{a\sigma}{\sqrt n}$$

intersecting at $\bar x$ on the $\mu$-axis.

Note that the region constructed does not have a minimum area. The minimum region is roughly elliptical in shape, and is difficult to construct.

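Illustrative Examples

Ex. 1. Compute the 95% confidence interval for the parameter $\theta$ in the distribution

$$f(x \mid \theta)=\frac{2}{\theta^2}(\theta-x),\qquad 0<x<\theta,$$

for samples of size one. [Delhi 1960; Patna 1955]

In fact, $x$ is the observation. Then

$$f(x \mid \theta)=\frac{2}{\theta^2}(\theta-x)$$

gives

$$\log f(x \mid \theta)=\text{const}-2\log\theta+\log(\theta-x).$$

Hence the likelihood equation

$$\frac{\partial}{\partial\theta}\log f(x \mid \theta)=-\frac{2}{\theta}+\frac{1}{\theta-x}=0$$

provides $\hat\theta=2x$, which is the estimator of the parameter $\theta$.

Then the distribution of the estimator $\hat\theta$ is

$$g(\hat\theta \mid \theta)\,d\hat\theta=f(x \mid \theta)\,dx,$$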






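which implies that

$$g(\hat\theta \mid \theta)=\frac{1}{\theta^2}\Big(\theta-\frac{\hat\theta}{2}\Big)=\frac{1}{2\theta^2}(2\theta-\hat\theta),\qquad 0<\hat\theta<2\theta. \tag{1}$$

Consequently, 95 percent confidence intervals are obtained by determining $h_1(\theta)$ and $h_2(\theta)$ such that

$$\int_0^{h_1(\theta)}g(\hat\theta \mid \theta)\,d\hat\theta=0.025=\int_{h_2(\theta)}^{2\theta}g(\hat\theta \mid \theta)\,d\hat\theta. \tag{2}$$

From (2), we have

$$\int_0^{h_1(\theta)}\frac{1}{2\theta^2}(2\theta-\hat\theta)\,d\hat\theta=0.025,$$

from which we obtain

$$\frac{h_1(\theta)}{\theta}-\frac{h_1^2(\theta)}{4\theta^2}=0.025, \tag{3}$$

i.e.

$$h_1^2(\theta)-4\theta\,h_1(\theta)+0.1\,\theta^2=0,$$

which yields (taking the root lying between 0 and $2\theta$):

$$h_1(\theta)=2\theta-\theta\sqrt{3.9}=2\theta\{1-\sqrt{0.975}\}. \tag{4}$$

Similarly,

$$h_2(\theta)=2\{1-\sqrt{0.025}\}\,\theta. \tag{5}$$

Therefore,

$$P\{h_1(\theta)<\hat\theta<h_2(\theta)\}=0.95,$$

or

$$P\{2\theta\{1-\sqrt{0.975}\}<\hat\theta<2\theta\{1-\sqrt{0.025}\}\}=0.95.$$

Inverting the inequality leads to

$$P\Big\{\frac{\hat\theta}{2\{1-\sqrt{0.025}\}}<\theta<\frac{\hat\theta}{2\{1-\sqrt{0.975}\}}\Big\}=0.95. \tag{6}$$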



We plot (4) and (5) as straight lines as shown in the above figure, and the 95 percent confidence interval is given by

$$P\Big\{\frac{\hat\theta}{2\{1-\sqrt{0.025}\}}<\theta<\frac{\hat\theta}{2\{1-\sqrt{0.975}\}}\Big\}=0.95.$$

Then the 95 percent confidence limits are

$$\frac{\hat\theta}{2\{1-\sqrt{0.025}\}}\quad\text{and}\quad\frac{\hat\theta}{2\{1-\sqrt{0.975}\}},$$

which, in view of $\hat\theta=2x$, are

$$\frac{x}{1-\sqrt{0.025}}\quad\text{and}\quad\frac{x}{1-\sqrt{0.975}}. \tag{7}$$
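The limits in (7) can be checked by simulation. The sketch below assumes the density (2/θ²)(θ - x) on (0, θ), whose distribution function 1 - (1 - x/θ)² can be inverted to draw observations; the function names and the simulation settings are hypothetical.

```python
import math
import random

def size_one_limits(x, coefficient=0.95):
    """Confidence limits x/(1 - sqrt(tail)) and x/(1 - sqrt(1 - tail)) for theta."""
    tail = (1 - coefficient) / 2          # 0.025 for a 95% interval
    return x / (1 - math.sqrt(tail)), x / (1 - math.sqrt(1 - tail))

def draw_observation(theta, rng):
    """One observation from f(x | theta), by inverting F(x) = 1 - (1 - x/theta)**2."""
    return theta * (1 - math.sqrt(1 - rng.random()))

def coverage_size_one(theta, trials=20000, seed=1):
    """Fraction of simulated single observations whose interval covers theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        lower, upper = size_one_limits(draw_observation(theta, rng))
        if lower < theta < upper:
            hits += 1
    return hits / trials
```

For an observation x = 1 the limits are roughly 1.19 and 79.5; the very wide upper limit reflects how little a single observation says about the parameter, while the simulated coverage stays close to 0.95.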

Ex. 2. For the rectangular population

$$dF=\frac{dx}{\theta},\qquad 0<x<\theta,$$

and the confidence coefficient $\epsilon$, the confidence limits for $\theta$ are $t$ and $t/\psi$, where $t$ is the sample range and $\psi$ is given by

$$\psi^{n-1}\{n-(n-1)\psi\}=1-\epsilon. \tag{1}$$

[M. A. Poona 1963]

Recall that

$$f_n(R; \theta)=n(n-1)\,\theta^{-n}R^{n-2}(\theta-R),\qquad 0<R<\theta,$$

which is the probability density function of the range $R$. Then

$$f_n(t; \theta)=n(n-1)\,\theta^{-n}t^{n-2}(\theta-t),\qquad 0<t<\theta. \tag{2}$$

Introduce the function $\psi=t/\theta$.

Then we find that the distribution of this function of the sample and parameter is independent of the true value of the parameter, and its probability density function, from (2), is

$$g(\psi)=n(n-1)\,\psi^{n-2}(1-\psi),\qquad 0\le\psi\le1. \tag{3}$$

Now we select a positive number $\epsilon<1$ (it is customary to take $\epsilon=0.95$ or $0.99$) and define $\psi_\epsilon$ from

$$\int_{\psi_\epsilon}^{1}g(\psi)\,d\psi=\epsilon,$$

which, in view of (3), becomes

$$n(n-1)\int_{\psi_\epsilon}^{1}\psi^{n-2}(1-\psi)\,d\psi=\epsilon,$$

which yields:

$$\psi_\epsilon^{n-1}\{n-(n-1)\psi_\epsilon\}=1-\epsilon.$$

Irrespective of the true value of $\theta$,

$$P\{\psi_\epsilon\le\psi\le1\}=P\{\psi_\epsilon\le t/\theta\le1\}=\epsilon. \tag{4}$$

Inverting the inequality (4) furnishes:

$$P\{t\le\theta\le t/\psi_\epsilon\}=\epsilon.$$

The reader must bear in mind that $t$ is the random variable in the statement but $\theta$ is not. The random interval $\delta(O_n)=(t, t/\psi_\epsilon)$ is claimed to be a confidence interval for $\theta$, and $t$ and $t/\psi_\epsilon$ are the confidence limits.
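Equation (1) of this example has no closed-form root for general n, but it is easily solved numerically. The sketch below uses bisection, which is an illustrative choice and not the book's method: the left-hand side increases from 0 to 1 on the unit interval, so a single root exists. The function names, tolerance and simulation settings are hypothetical.

```python
import random

def solve_psi(n, eps, tol=1e-12):
    """Root in (0, 1) of psi**(n-1) * (n - (n-1)*psi) = 1 - eps, by bisection."""
    target = 1.0 - eps
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** (n - 1) * (n - (n - 1) * mid) < target:
            lo = mid          # left-hand side is increasing, so move right
        else:
            hi = mid
    return (lo + hi) / 2

def range_limits(sample, eps=0.95):
    """Confidence limits (t, t/psi) for theta from the sample range t."""
    t = max(sample) - min(sample)
    return t, t / solve_psi(len(sample), eps)

def coverage_range(theta, n, eps=0.95, trials=20000, seed=2):
    """Fraction of simulated rectangular samples whose interval covers theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        lower, upper = range_limits([rng.uniform(0, theta) for _ in range(n)], eps)
        if lower <= theta <= upper:
            hits += 1
    return hits / trials
```

For n = 2 the equation reduces to (1 - ψ)² = ε, which gives a closed-form check on the solver; for larger n the simulated coverage should be close to ε.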

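16.7. Confidence Interval for large samples.

It is a fact that for large samples, the maximum likelihood estimator $\hat\theta$ for a parameter $\theta$ in a density $f(x \mid \theta)$ is approximately normally distributed under rather general regularity conditions. When these conditions hold, it is possible to find approximate confidence intervals quite easily. The large sample variance of an estimator $\hat\theta$ is given by

$$\sigma^2(\hat\theta)=-\frac{1}{n\,E\Big[\dfrac{\partial^2}{\partial\theta^2}\log f(x \mid \theta)\Big]}, \tag{1}$$

which is a function of $\theta$ because it ordinarily depends on $\theta$.

For large samples, therefore, an approximate confidence interval with confidence coefficient $\alpha$ is given by

$$P\Big\{-d_\alpha<\frac{\hat\theta-\theta}{\sigma(\hat\theta)}<d_\alpha\Big\}\cong\alpha, \tag{2}$$

where $d_\alpha$ is chosen such that

$$\int_{-d_\alpha}^{d_\alpha}\frac{1}{\sqrt{2\pi}}e^{-t^2/2}\,dt=\alpha.$$

(2) can be written as

$$P\{\hat\theta-d_\alpha\,\sigma(\hat\theta)<\theta<\hat\theta+d_\alpha\,\sigma(\hat\theta)\}\cong\alpha, \tag{3}$$

when $\hat\theta$ is asymptotically normally distributed and $\sigma(\hat\theta)$ is the maximum likelihood estimator of the standard deviation of $\hat\theta$.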

Illustrative Examples. 

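Ex. 1. For the distribution:

$$f(x \mid \theta) = \theta e^{-\theta x}, \quad 0 \le x < \infty,\ \theta > 0,$$

central confidence limits for a large sample of size $n$ with confidence coefficient $\alpha = 0.95$ are given by

$$\theta = \left(1 \pm \frac{1.96}{\sqrt{n}}\right)\frac{1}{\bar x}$$

[Patna 1952, Agra B.Sc. 1972]

For, our hypothesis gives $f(x \mid \theta) = \theta e^{-\theta x}$, so that the LF is

$$L(x \mid \theta) = \prod_{j=1}^{n} f(x_j \mid \theta) = \theta^{n}\exp\Big(-\theta\sum x_j\Big)$$

Thus $\log L(x \mid \theta) = n\log\theta - \theta\sum x_j$, which, on differentiation with respect to $\theta$, furnishes

$$\frac{\partial \log L(x \mid \theta)}{\partial\theta} = \frac{n}{\theta} - \sum x_j$$

The likelihood equation $\partial \log L/\partial\theta = 0$ leads to $n/\hat\theta - \sum x_j = 0$, giving

$$\hat\theta = \frac{n}{\sum x_j} = \frac{1}{\bar x}$$

Now

$$V(\hat\theta) = \frac{1}{n\,E\left[-\dfrac{\partial^2 \log f(x \mid \theta)}{\partial\theta^2}\right]} = \frac{1}{n\,E(1/\theta^2)} = \frac{\theta^2}{n}$$

for

$$E(1/\theta^2) = \int_0^\infty \frac{1}{\theta^2}\,\theta e^{-\theta x}\,dx = \frac{1}{\theta^2}$$

Thus $\sigma(\hat\theta) = \hat\theta/\sqrt{n} = 1/(\bar x\sqrt{n})$ in large samples. Evidently

$$\frac{\hat\theta-\theta}{\sigma(\hat\theta)} = \frac{(1/\bar x) - \theta}{1/(\bar x\sqrt{n})}$$

is a normal variate with mean zero and variance 1. Letting

$$\frac{(1/\bar x) - \theta}{1/(\bar x\sqrt{n})} = \pm d_\alpha = \pm 1.96$$

such that

$$\frac{1}{\sqrt{2\pi}}\int_{-d_\alpha}^{d_\alpha} e^{-t^2/2}\,dt = 0.95$$

provides:

$$\theta = \frac{1}{\bar x} \pm \frac{1.96}{\bar x\sqrt{n}}$$

from which we obtain

$$\theta = \left(1 \pm \frac{1.96}{\sqrt{n}}\right)\frac{1}{\bar x}$$

which are the desired central confidence limits of $\theta$. This completes the solution of the problem.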
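
The limits $(1 \pm d_\alpha/\sqrt{n})/\bar x$ are easy to evaluate numerically. The sketch below is illustrative only: the function name `exponential_rate_limits` is an assumption, and the normal deviate $d_\alpha$ is taken from Python's standard `statistics.NormalDist` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist


def exponential_rate_limits(sample, alpha=0.95):
    """Large-sample central limits (1 -/+ d/sqrt(n)) / x_bar for the rate theta."""
    n = len(sample)
    x_bar = sum(sample) / n
    # d_alpha leaves probability alpha between -d_alpha and +d_alpha
    d_alpha = NormalDist().inv_cdf((1 + alpha) / 2)
    theta_hat = 1 / x_bar
    half_width = d_alpha / sqrt(n)
    return theta_hat * (1 - half_width), theta_hat * (1 + half_width)
```

With $n = 100$ and $\bar x = 0.5$ the estimate is $\hat\theta = 2$ and the limits are about 1.608 and 2.392.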

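Ex. 2. Set up confidence limits for the parameter $\lambda$ of the Poisson distribution whose general term is

$$f(x \mid \lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}$$

Evidently we have

$$\log L(x \mid \lambda) = -n\lambda + \sum_{j=1}^{n} x_j \log\lambda + \log\frac{1}{\prod x_j!}$$

which yields:

$$\frac{\partial \log L(x \mid \lambda)}{\partial\lambda} = -n + \frac{\sum x_j}{\lambda} = \frac{n}{\lambda}(\bar x - \lambda)$$

so that the likelihood equation $\partial \log L(x \mid \lambda)/\partial\lambda = 0$ leads to

$$\hat\lambda = \bar x$$

which is the ML estimator of the parameter $\lambda$. Thus

$$V(\hat\lambda) = V(\bar x) = \frac{\lambda}{n}$$

Define $d$ as follows:

$$d = \frac{\partial \log L(x \mid \theta)/\partial\theta}{\left\{E\left[\left(\dfrac{\partial \log L(x \mid \theta)}{\partial\theta}\right)^{2}\right]\right\}^{1/2}}$$

so that $d$ is a standardized normal variate in large samples. It is important to note that from the normal integral we may compute confidence limits for $\theta$ in large samples provided that $d$ is a monotonic function of $\theta$, so that inequalities in one may be transformed into inequalities in the other.

Now

$$d = \frac{n}{\lambda}(\bar x - \lambda)\Big/\sqrt{n/\lambda} = (\bar x - \lambda)\sqrt{n/\lambda}$$   ...(1)

with $1-\alpha = 0.95$. Corresponding to the normal deviates $\pm 1.96$, the central confidence limits in large samples are given by

$$\frac{\hat\lambda-\lambda}{\sigma(\hat\lambda)} = \frac{\bar x-\lambda}{\sqrt{\bar x/n}} = \pm 1.96,$$

which yields:

$$\lambda = \bar x \pm d_\alpha\sqrt{\bar x/n} = \bar x \pm 1.96\sqrt{\bar x/n}$$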

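Remarks. R1. For large samples, with $d_\alpha = 1.96$,

$$P\{\hat\theta - d_\alpha\,\sigma(\hat\theta) \le \lambda \le \hat\theta + d_\alpha\,\sigma(\hat\theta)\} = 0.95$$

yields:

$$P\{\bar x - 1.96\sqrt{\bar x/n} \le \lambda \le \bar x + 1.96\sqrt{\bar x/n}\} = 0.95,$$

from which we have in hand the desired 95% central limits for

$$f(x, \lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}$$

R2. By (1), with $1-\alpha = 0.95$, we have

$$(\bar x-\lambda)\sqrt{n/\lambda} = \pm 1.96, \quad\text{giving}\quad (\bar x-\lambda)^2\,\frac{n}{\lambda} = (1.96)^2,$$

i.e.

$$\bar x^2 - 2\lambda\bar x + \lambda^2 = (1.96)^2\,\frac{\lambda}{n}$$

i.e.

$$\lambda^2 - 2\lambda\left(\bar x + \frac{1.92}{n}\right) + \bar x^2 = 0,$$

from which we find

$$\lambda = \bar x + \frac{1.92}{n} \pm \left(\frac{3.84\,\bar x}{n} + \frac{3.69}{n^2}\right)^{1/2}$$

To the order $n^{-1/2}$ this is equivalent to

$$\lambda = \bar x \pm 1.96\sqrt{\bar x/n}.$$

This asserts that, to this order, the upper and lower confidence limits are equidistant from the sample mean $\bar x$.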
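
To see how close the two forms of the Poisson limits are, both can be computed side by side. This is a small sketch under the large-sample assumptions of the example; the function names are illustrative.

```python
from math import sqrt


def poisson_limits_quadratic(x_bar, n, d=1.96):
    """Roots of (x_bar - lam)**2 * n / lam = d**2, i.e. the limits of Remark R2."""
    half_d2 = d * d / (2 * n)                 # about 1.92/n when d = 1.96
    centre = x_bar + half_d2
    radius = sqrt(d * d * x_bar / n + half_d2 ** 2)
    return centre - radius, centre + radius


def poisson_limits_approx(x_bar, n, d=1.96):
    """Simplified limits x_bar -/+ d * sqrt(x_bar / n), correct to order n**-0.5."""
    half_width = d * sqrt(x_bar / n)
    return x_bar - half_width, x_bar + half_width
```

For $\bar x = 4$ and $n = 100$ the simplified limits are 3.608 and 4.392, and the roots of the quadratic differ from them by only about 0.02.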

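Ex. 3. Compute the confidence interval for the binomial distribution with parameter $p$ and then obtain the confidence interval for large samples.

As a matter of fact

$$\sigma^2(\hat p) = \frac{p(1-p)}{n}$$

Now

$$P\left\{-d_\alpha < \frac{\hat p - p}{\sqrt{p(1-p)/n}} < d_\alpha\right\} \simeq \alpha$$   ...(1)

where $d_\alpha$ is so chosen that

$$\frac{1}{\sqrt{2\pi}}\int_{-d_\alpha}^{d_\alpha} e^{-t^2/2}\,dt = \alpha$$

At the boundary of the inequality in (1),

$$\left|\frac{\hat p - p}{\sqrt{p(1-p)/n}}\right| = d_\alpha \;\Rightarrow\; \hat p^2 + p^2 - 2p\hat p = \frac{p(1-p)\,d_\alpha^2}{n} \;\Rightarrow\; (\hat p^2 + p^2 - 2p\hat p)\,n = p(1-p)\,d_\alpha^2$$

$$\Rightarrow\; p^2(n + d_\alpha^2) - p\,(2n\hat p + d_\alpha^2) + n\hat p^2 = 0 \;\Rightarrow\;$$

$$p = \frac{2n\hat p + d_\alpha^2 \pm d_\alpha\sqrt{4n\hat p + d_\alpha^2 - 4n\hat p^2}}{2(n + d_\alpha^2)}$$

after slight simplification.

Then converting the inequalities in (1) leads to

$$P\left\{\frac{2n\hat p + d_\alpha^2 - d_\alpha\sqrt{4n\hat p + d_\alpha^2 - 4n\hat p^2}}{2(n + d_\alpha^2)} < p < \frac{2n\hat p + d_\alpha^2 + d_\alpha\sqrt{4n\hat p + d_\alpha^2 - 4n\hat p^2}}{2(n + d_\alpha^2)}\right\} \simeq \alpha$$   ...(2)

Then

$$\frac{2n\hat p + d_\alpha^2 - d_\alpha\sqrt{4n\hat p + d_\alpha^2 - 4n\hat p^2}}{2(n + d_\alpha^2)} \le p \le \frac{2n\hat p + d_\alpha^2 + d_\alpha\sqrt{4n\hat p + d_\alpha^2 - 4n\hat p^2}}{2(n + d_\alpha^2)}$$   ...(3)

provides the confidence interval for $p$.

Recall that the asymptotic normal distribution is correct only to within error terms of size $k/\sqrt{n}$. Hence we may simply neglect terms of this order in the limits in (3) without affecting the accuracy of the approximation. This implies that we may omit all the $d_\alpha^2$ in (3), because they always occur added to a term with factor $n$ and are negligible with respect to $n$ when $n$ is large, to within the degree of approximation we are assuming. Therefore (3) assumes the form

$$P\left\{\hat p - d_\alpha\left[\frac{\hat p(1-\hat p)}{n}\right]^{1/2} \le p \le \hat p + d_\alpha\left[\frac{\hat p(1-\hat p)}{n}\right]^{1/2}\right\} \simeq \alpha$$   ...(4)

Particularly,

$$P\left\{\hat p - 1.96\left[\frac{\hat p(1-\hat p)}{n}\right]^{1/2} \le p \le \hat p + 1.96\left[\frac{\hat p(1-\hat p)}{n}\right]^{1/2}\right\} \simeq 0.95$$

gives an approximate 95 percent confidence interval for $p$ for large samples.

Remark. For large samples, we may use the formula

$$P\{\hat\theta - d_\alpha\,\sigma(\hat\theta) \le \theta \le \hat\theta + d_\alpha\,\sigma(\hat\theta)\} \simeq \alpha$$

directly, where $\hat\theta$, $d_\alpha$ and $\sigma(\hat\theta)$ have their usual meanings.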
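
The limits from the quadratic in $p$ and the simplified large-sample limits can be compared numerically. The following is a hedged sketch with illustrative function names; it assumes only the two formulas of this example.

```python
from math import sqrt


def binomial_limits_quadratic(p_hat, n, d=1.96):
    """Roots of (p_hat - p)**2 * n = p * (1 - p) * d**2 (limits before simplification)."""
    root = d * sqrt(4 * n * p_hat + d * d - 4 * n * p_hat ** 2)
    centre = 2 * n * p_hat + d * d
    denom = 2 * (n + d * d)
    return (centre - root) / denom, (centre + root) / denom


def binomial_limits_large_sample(p_hat, n, d=1.96):
    """Simplified limits p_hat -/+ d * sqrt(p_hat * (1 - p_hat) / n)."""
    half_width = d * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width
```

For $\hat p = 0.5$ and $n = 100$ the simplified limits are 0.402 and 0.598; the quadratic limits are slightly narrower, and the difference shrinks as $n$ grows.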

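Ex. 4. (due to Huzurbazar) A confidence interval for $\theta$ in

$$dF = \exp\{-(x-\theta)\}\,dx, \quad \theta \le x < \infty$$

is obtainable from

$$P\left\{x_{(1)} + \tfrac{1}{n}\log\alpha \le \theta \le x_{(1)}\right\} = 1-\alpha$$

[M.A. Poona 1962]

For, the distribution of the smallest variate $x_{(1)}$ is obtained from the distribution of the $r$-th order statistic,

$$dF = \frac{n!}{(r-1)!\,(n-r)!}\,\{F(x_r)\}^{r-1}\{1-F(x_r)\}^{n-r} f(x_r)\,dx_r$$

on replacing $r$ by 1. Then

$$dF = n\,\{1-F(x_{(1)})\}^{n-1} f(x_{(1)})\,dx_{(1)}$$

Letting $1 - F(x_{(1)}) = \psi$ provides (apart from sign):

$$dF = n\,\psi^{\,n-1}\,d\psi, \quad 0 \le \psi \le 1$$

Here $\psi = \exp\{-(x_{(1)}-\theta)\}$, where $x_{(1)}$ is the smallest observation. Then

$$P\{\alpha^{1/n} \le \psi \le 1\} = 1-\alpha \quad\text{implies}$$

$$P\{\alpha^{1/n} \le \exp\{-(x_{(1)}-\theta)\} \le 1\} = 1-\alpha \;\Rightarrow$$

$$P\left\{\tfrac{1}{n}\log\alpha \le -x_{(1)} + \theta \le 0\right\} = 1-\alpha \;\Rightarrow$$

$$P\left\{x_{(1)} + \tfrac{1}{n}\log\alpha \le \theta \le x_{(1)}\right\} = 1-\alpha$$

This completes the proof.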
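
The interval depends on the sample only through its smallest member, which the following sketch makes explicit. The function name `location_limits` is illustrative; `alpha` follows the convention of this example, so the interval has confidence $1-\alpha$.

```python
from math import log


def location_limits(sample, alpha):
    """Limits (x_(1) + log(alpha)/n, x_(1)) for theta in dF = exp{-(x - theta)} dx."""
    n = len(sample)
    x_min = min(sample)
    # log(alpha) is negative, so the lower limit lies below the smallest observation
    return x_min + log(alpha) / n, x_min
```

Because $\log\alpha$ is divided by $n$, the length of the interval, $-\tfrac{1}{n}\log\alpha$, does not depend on the data and shrinks in proportion to $1/n$.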

Ex. 5. One head and two tails resulted when a coin was tossed three times. Find a 90% confidence interval for the probability of head.

Denote the probability of head by $p$. Then the probability function for a single toss is

$$f(x) = p^{x} q^{1-x}, \quad x = 0, 1 \quad (q = 1-p)$$

and hence the LF is

$$L(x \mid p) = p^{\Sigma x}\, q^{\,n-\Sigma x}$$

so that $\log L(x \mid p) = (\Sigma x)\log p + (n-\Sigma x)\log q$. The likelihood equation $\partial \log L(x \mid p)/\partial p = 0$ gives:

$$\frac{\Sigma x}{\hat p} - \frac{n-\Sigma x}{1-\hat p} = 0$$

which provides:

$$\hat p = \frac{\Sigma x}{n} = \bar x,$$

the sample mean.

The distribution of $\hat p$ is then

$$dF(\hat p) = \binom{n}{n\hat p}\, p^{\,n\hat p}(1-p)^{\,n(1-\hat p)}$$

Now we seek to find $h_1(p)$ and $h_2(p)$ such that

$$\sum_{y=0}^{n h_1(p)} \binom{n}{y} p^{y}(1-p)^{n-y} = \sum_{y=n h_2(p)}^{n} \binom{n}{y} p^{y}(1-p)^{n-y} = \epsilon/2$$

where $y = n\hat p = \Sigma x$. Here $n h(p)$ must be an integer, and hence the sum cannot be made exactly equal to $\epsilon/2$ for each value of $p$. Anyhow, we have to determine the value of $p$ for which

$$\sum_{y=0}^{n\hat p} \binom{n}{y} p^{y}(1-p)^{n-y} = \epsilon/2$$

If $\epsilon = 0.10$ is the coefficient level, then letting $n\hat p = 1$ provides:

$$np(1-p) = 0.05, \quad\text{giving}\quad np^2 - np + 0.05 = 0,$$

which yields:

$$p = \frac{n \pm \sqrt{n^2 - 0.2\,n}}{2n}$$

Let $n = 3$. Then we have

$$p = \tfrac{1}{2} \pm \tfrac{1}{6}\sqrt{8.4} = \tfrac{1}{2} \pm 0.483$$

Thus the confidence limits for $p$ are

$$\tfrac{1}{2} + 0.483, \quad \tfrac{1}{2} - 0.483,$$

i.e. 0.983, 0.017, and the 90% confidence interval is (0.017, 0.983).
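
For comparison, the two binomial tail sums can be solved exactly rather than approximately. The sketch below is not the calculation used in this example: it is the equal-tail exact construction (often called the Clopper-Pearson interval), swapped in here as an illustration, with hypothetical function names and a simple bisection solver.

```python
from math import comb


def binom_cdf(k, n, p):
    """P{Y <= k} for Y binomial (n, p)."""
    return sum(comb(n, y) * p ** y * (1 - p) ** (n - y) for y in range(k + 1))


def solve_decreasing(f, target, tol=1e-12):
    """Bisection on [0, 1] for a function f that decreases as p increases."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_binomial_limits(y, n, eps):
    """Equal-tail limits: each tail sum is set equal to eps/2."""
    # lower limit: P{Y >= y | p} = eps/2, i.e. P{Y <= y-1 | p} = 1 - eps/2
    lower = 0.0 if y == 0 else solve_decreasing(
        lambda p: binom_cdf(y - 1, n, p), 1 - eps / 2)
    # upper limit: P{Y <= y | p} = eps/2
    upper = 1.0 if y == n else solve_decreasing(
        lambda p: binom_cdf(y, n, p), eps / 2)
    return lower, upper
```

For one head in three tosses with $\epsilon = 0.10$ this gives a lower limit of about 0.017, agreeing with the value above, and an upper limit of about 0.865.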

16.8 The optimum property of large-sample intervals and regions based on maximum likelihood estimators is as follows:

Large-sample confidence intervals and regions based on ML estimators are smaller on the average than intervals and regions determined by any other estimators of the parameters. This property of ML estimators is interlinked with the fact that they are efficient, i.e., they have smaller variance in large samples than other estimators.

Illustration. We recall that the shortest 95 percent interval for the mean $\mu$ of a normal population, when $\sigma$ is known, is given by

$$P\left\{\bar x - \frac{1.96\,\sigma}{\sqrt{n}} < \mu < \bar x + \frac{1.96\,\sigma}{\sqrt{n}}\right\} = 0.95$$

The length of this interval is $2(1.96\,\sigma/\sqrt{n})$, where $n$ is the sample size. Now assume, in order to construct the confidence interval, that only one of the observations, say the first, will be used instead of $\bar x = \frac{1}{n}\sum x_i$. The estimator is then

$$\hat\mu = x_1$$

and hence the confidence interval is given by

$$P\{x_1 - 1.96\,\sigma < \mu < x_1 + 1.96\,\sigma\} = 0.95,$$

which has the length $2 \times 1.96\,\sigma$. Conclusion: this interval is $\sqrt{n}$ times as large as the one obtained by using the sample mean as the estimator.

16.9 Central and non-Central intervals.

Assume that $O_n : (x_1, x_2, \ldots, x_n)$ is a random sample from a given population. Recall that

$$P\{t_1 \le \theta \le t_2\} = 1-\epsilon$$

where $t_1 \le \theta \le t_2$ is a random interval such that $t_1$ and $t_2$ are functions of $x_1, \ldots, x_n$. Evidently, writing $\epsilon = \epsilon_1 + \epsilon_2$,

$$1-\epsilon_1 = P\{t_1 \le \theta\}$$

and

$$1-\epsilon_2 = P\{\theta \le t_2\}.$$

The limits are at equal deviations from the sample statistic when the sampling distribution on which the confidence intervals are based is symmetrical.

Further assume that $\epsilon_1$ and $\epsilon_2$ are equal. Then the intervals are said to be central. In such a case we have

$$P\{t_1 > \theta\} = P\{\theta > t_2\} = \epsilon/2$$

In the contrary case the intervals are said to be non-central.

Caution. Centrality in this sense does not imply that the confidence limits are equidistant from the sample statistic unless the sampling distribution is symmetrical.

Illustrative Examples. 

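Ex. 1. For two normal populations with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2 = \sigma_2^2 = \sigma^2$, independent samples of sizes $n_1$ and $n_2$ respectively are drawn. Then

$$t = \frac{(\bar x_1 - \bar x_2) - (\mu_1 - \mu_2)}{\left\{\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1+n_2-2}\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)\right\}^{1/2}}$$

(where $\bar x_1, \bar x_2$ and $s_1^2, s_2^2$ are the sample means and variances) is distributed in "Student's" distribution with $(n_1+n_2-2)$ degrees of freedom, and hence set confidence limits for $(\mu_1-\mu_2)$.

Our hypothesis provides:

$$x_{1i} \sim N(\mu_1, \sigma^2) \;\Rightarrow\; \bar x_1 \sim N(\mu_1, \sigma^2/n_1)$$

and

$$x_{2i} \sim N(\mu_2, \sigma^2) \;\Rightarrow\; \bar x_2 \sim N(\mu_2, \sigma^2/n_2)$$

Then $(\bar x_1-\mu_1) \sim N(0, \sigma^2/n_1)$ and $(\bar x_2-\mu_2) \sim N(0, \sigma^2/n_2)$.

As a matter of fact,

$$E\{(\bar x_1-\mu_1) - (\bar x_2-\mu_2)\} = E(\bar x_1-\mu_1) - E(\bar x_2-\mu_2) = 0$$

and

$$V\{(\bar x_1-\mu_1) - (\bar x_2-\mu_2)\} = V(\bar x_1-\mu_1) + V(\bar x_2-\mu_2)$$

(because $\mathrm{cov}\{(\bar x_1-\mu_1), (\bar x_2-\mu_2)\} = 0$ in view of the independence of the samples)

$$= \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}, \quad\text{since } \sigma_1^2 = \sigma_2^2 = \sigma^2.$$

Then

$$\frac{(\bar x_1-\mu_1) - (\bar x_2-\mu_2)}{\{\sigma^2(1/n_1 + 1/n_2)\}^{1/2}} \sim N(0, 1)$$   ...(1)

But $n_1 s_1^2 = \sum (x_{1i} - \bar x_1)^2$, and hence

$$\frac{n_1 s_1^2}{\sigma^2} = \frac{\sum (x_{1i} - \bar x_1)^2}{\sigma^2} \sim \chi^2 \text{ with } n_1-1 \text{ d.f.}$$

Similarly,

$$\frac{n_2 s_2^2}{\sigma^2} = \frac{\sum (x_{2i} - \bar x_2)^2}{\sigma^2} \sim \chi^2 \text{ with } n_2-1 \text{ d.f.}$$

Then

$$\frac{n_1 s_1^2 + n_2 s_2^2}{\sigma^2} \sim \chi^2 \text{ with } n_1+n_2-2 \text{ d.f.}$$   ...(2)

Now

$$t = \frac{(\bar x_1-\mu_1) - (\bar x_2-\mu_2)}{\{\sigma^2(1/n_1 + 1/n_2)\}^{1/2}} \div \left\{\frac{n_1 s_1^2 + n_2 s_2^2}{\sigma^2\,(n_1+n_2-2)}\right\}^{1/2}$$

Then, in view of (1) and (2),

$$\frac{t^2}{\nu} = \frac{\chi^2 \text{ with one d.f.}}{\chi^2 \text{ with } \nu \text{ d.f.}} \sim \beta_2\left(\tfrac{1}{2}, \tfrac{\nu}{2}\right), \quad\text{where } \nu = n_1+n_2-2,$$

which provides:

$$dF\left(\frac{t^2}{\nu}\right) = \frac{1}{B(\tfrac{1}{2}, \tfrac{\nu}{2})}\,\frac{(t^2/\nu)^{-1/2}}{(1 + t^2/\nu)^{(\nu+1)/2}}\,d(t^2/\nu), \quad 0 \le t^2/\nu < \infty$$

so that

$$dF(t) = \frac{1}{\sqrt{\nu}\,B(\tfrac{1}{2}, \tfrac{\nu}{2})}\,\frac{dt}{(1 + t^2/\nu)^{(\nu+1)/2}}, \quad -\infty < t < \infty$$

which is the $t$ distribution with $\nu = n_1+n_2-2$ d.f. This proves our assertion.

The $t$-distribution is symmetrical, and hence the central confidence interval can be determined from

$$P\{|t| \le t_\epsilon\} = 1-\epsilon, \quad \nu = n_1+n_2-2,$$

so that the confidence limits for $(\mu_1-\mu_2)$ are

$$(\bar x_1 - \bar x_2) \pm t_\epsilon\left\{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1+n_2-2}\left(\frac{1}{n_1}+\frac{1}{n_2}\right)\right\}^{1/2}$$

The value $t_\epsilon$ can be obtained from the table.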
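
Once $t_\epsilon$ has been read from the table, the limits for $\mu_1-\mu_2$ follow by direct arithmetic. The sketch below assumes the equal-variance case of this example; the function name `pooled_t_limits` is illustrative, and the tabulated value is passed in as `t_crit` because the Python standard library has no Student's t quantile function.

```python
from math import sqrt


def pooled_t_limits(sample1, sample2, t_crit):
    """Central limits for mu1 - mu2; t_crit is the table value for n1 + n2 - 2 d.f."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    ns1 = sum((x - mean1) ** 2 for x in sample1)   # n1 * s1^2
    ns2 = sum((x - mean2) ** 2 for x in sample2)   # n2 * s2^2
    nu = n1 + n2 - 2
    std_error = sqrt((ns1 + ns2) / nu * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff - t_crit * std_error, diff + t_crit * std_error
```

For the samples (1, 2, 3) and (2, 4, 6), with the 95% table value 2.776 for 4 d.f., the limits for $\mu_1-\mu_2$ are about $-5.58$ and $1.58$.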

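Ex. 2. For two normal populations with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$, $\sigma_2^2$ ($\sigma_1 \ne \sigma_2$), independent samples of sizes $n_1$ and $n_2$ respectively are drawn. Then

$$t = \frac{(\bar x_1 - \bar x_2) - (\mu_1 - \mu_2)}{(\sigma_1^2/n_1 + \sigma_2^2/n_2)^{1/2}} \div \left\{\frac{n_1 s_1^2/\sigma_1^2 + n_2 s_2^2/\sigma_2^2}{n_1+n_2-2}\right\}^{1/2}$$

is distributed in Student's distribution with $(n_1+n_2-2)$ d.f., and hence set confidence limits for $(\mu_1-\mu_2)$.

Following the steps of the preceding example,

$$\{(\bar x_1-\mu_1) - (\bar x_2-\mu_2)\} \sim N\left(0,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

Thus

$$\frac{(\bar x_1-\mu_1) - (\bar x_2-\mu_2)}{(\sigma_1^2/n_1 + \sigma_2^2/n_2)^{1/2}} \sim N(0, 1)$$

Also

$$\frac{n_1 s_1^2}{\sigma_1^2} \sim \chi^2 \text{ with } n_1-1 \text{ d.f.}, \qquad \frac{n_2 s_2^2}{\sigma_2^2} \sim \chi^2 \text{ with } n_2-1 \text{ d.f.},$$

and hence

$$\frac{n_1 s_1^2}{\sigma_1^2} + \frac{n_2 s_2^2}{\sigma_2^2} \sim \chi^2 \text{ with } n_1+n_2-2 \text{ d.f.}$$

Therefore $t^2/\nu \sim \beta_2(\tfrac{1}{2}, \tfrac{\nu}{2})$ with $\nu = n_1+n_2-2$, and thus

$$t = \frac{(\bar x_1 - \bar x_2) - (\mu_1 - \mu_2)}{(\sigma_1^2/n_1 + \sigma_2^2/n_2)^{1/2}} \div \left\{\frac{n_1 s_1^2/\sigma_1^2 + n_2 s_2^2/\sigma_2^2}{n_1+n_2-2}\right\}^{1/2}$$

is Student's distribution with $n_1+n_2-2$ d.f. The confidence limits for $(\mu_1-\mu_2)$ are

$$(\bar x_1 - \bar x_2) \pm t_\epsilon\left(\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)^{1/2}\left\{\frac{n_1 s_1^2/\sigma_1^2 + n_2 s_2^2/\sigma_2^2}{n_1+n_2-2}\right\}^{1/2}$$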


Exercises 


1. Using the density function $f(\hat\theta) = 4\hat\theta^{3}/\theta^{4}$ of the largest of four observations from a rectangular population, set up a general system of 95% confidence intervals for $\theta$ by obtaining $h_1(\theta)$ and $h_2(\theta)$ and plotting these in the $(\hat\theta, \theta)$ plane. Find the interval for the sample (2.6, 1.2, 4.3, 1.6).

2. Find a 90 percent confidence interval for the mean of a normal distribution with $\sigma = 3$, given the sample (… −9). What would be the confidence interval if $\sigma$ were unknown?

3. Five samples were drawn from populations assumed to be normal and assumed to have the same variance. The values of $ns^2 = \sum (x_j - \bar x)^2$ and $n$, the sample size, were

$ns^2$ : 40 22 17 42 15

$n$ : 6 4 3 7 8.

Find 98% confidence limits for the common variance.

4. The sample (2.3, 1.2, 0.9, 3.2) was drawn from a population distributed by $f(x) = \theta e^{-\theta x}$, $x > 0$. Find a confidence interval for $\theta$.

5. Given a sample of size 100 from a normal population with $\mu = 3$, $\sigma^2 = .25$. What is the maximum likelihood estimate of the number $\theta$ for which

1 - exD J- ( —1= 05 

J. wvcSo P i^ ™ 1 

[From the table. 8^1-645 = l 822>] d 

6. (a) Discuss the relations between sufficient estimators and minimal properties of confidence intervals.

(b) Show that for a rectangular population

$$dF = \frac{dx}{\theta}, \quad 0 \le x \le \theta,$$

and confidence coefficient $\alpha$, confidence limits for $\theta$ are $t$ and $t/\psi$, where $t$ is the sample range and $\psi$ is given by

$$\psi^{\,n-1}\{n-(n-1)\,\psi\} = 1-\alpha$$   [Patna 1958, 1951]

7. Discuss the general method of finding the confidence interval for a parameter. [Patna 1952, 55, 56, 57, 59]

Obtain 95% confidence intervals for the following:

(a) Mean of a normal population when the variance is (i) known and (ii) unknown, assuming that a sample of size $n$ is available.

(b) Variance of a normal population. [Patna 1956]

8. Show that if $z$ be the largest of samples of size $n$ for the rectangular population

$$dF = \frac{dx}{\theta}, \quad 0 \le x \le \theta,$$

confidence limits for $\theta$ with confidence coefficient $\alpha$ are $z$ and $z/(1-\alpha)^{1/n}$. [Patna 1959]

9. What is the difference between confidence interval and point estimation?

Find 95% central confidence limits of $\theta$ for large samples for the distribution

$$dF = \theta e^{-x\theta}\,dx, \quad 0 < x < \infty$$   [Patna 1951, 52]

10. If $x_1$ and $x_2$ follow the rectangular distribution

$$dF = \frac{dx}{\theta}, \quad 0 \le x \le \theta,$$

and $L$ is the larger of a sample of two, show that confidence limits corresponding to the confidence coefficient $\alpha$ are given by $L$ and $L/\sqrt{1-\alpha}$.

11. Obtain the 95% confidence interval for $a$ in

$$f(x, a) = \frac{2}{a^2}(a - x), \quad 0 \le x \le a$$

for samples of size one. [Patna 1955]

12. A random sample of two members from a rectangular population with range $\theta$ is 9, - 1. Obtain 81% confidence limits for $\theta$ from the sample. [M.A. Poona 1959]



17

TESTS OF HYPOTHESIS

17.1. Definition. “Every particle of matter in the universe attracts every other particle” or “life exists on Mars” is a scientific hypothesis.

Assume that we have a set of random variables $x_1, x_2, \ldots, x_n$ which are represented as the coordinates of a point $x$, say, in [n]* or $R_n$, one of whose axes corresponds to each variable. $x$, being a random variable, has a probability distribution. We select a region, say $\omega$, in the sample space $\Omega$ such that the probability that the sample point $x$ falls in $\omega$ is $P\{x \in \omega\}$. Then any hypothesis concerning $P\{x \in \omega\}$ is a statistical hypothesis. That is to say, any hypothesis concerning the behaviour of observable random variables is a statistical hypothesis. This implies that a statistical hypothesis is an assumption about the frequency function of a random variable.

Illustrations 

Ii- (For a discrete variable). Denote the proportion of all 
insects possessing the less common markings by p. Then the 
assumption that p—\ is a statistical hypothesis. 

I2. As an illustration for a continuous variable, assume that the random variable $x$ denotes the time that elapses between two successive tippings of a Geiger counter in studying cosmic radiation, and further assume that the frequency function for $x$ is a function of the form

$$f(x \mid \theta) = \theta \exp(-\theta x)$$

where $\theta$ is a parameter whose value depends upon the experimental conditions. The assumption that the frequency function is a function of this particular form is evidently a statistical hypothesis.

I3. The hypothesis that the parameter $\theta$ is equal to 4 is also a statistical hypothesis.


* [n] denotes the n — dimensional sample space. 




I4. The hypothesis that a normal distribution has a given mean and variance is statistical.

I5. The hypothesis that it has a given mean but unspecified variance is a statistical one.

I6. The hypothesis that a distribution is of normal form, both mean and variance unspecified, is statistical.

I7. The assumption that two unspecified continuous distributions are identical is a statistical hypothesis.

Note that each of the above illustrations implies certain properties of the sample space. Hence we can translate each of them into statements concerning the sample space.

Definition. A hypothesis concerned entirely with the value of one or more parameters of the distribution under consideration is said to be parametric.

Illustrations I1 to I5 are clearly parametric hypotheses whereas I6 and I7 are non-parametric hypotheses.

17.2. Simple and Composite hypotheses

Definition. Assume that we have a distribution depending upon $k$ parameters, and a hypothesis specifies unique values for $j$ of these parameters. Then the hypothesis is said to be simple if $j = k$, and otherwise composite if $j < k$.

In geometrical terms, it is possible to represent the possible values of the parameters as a region in [k], one dimension for each parameter. A hypothesis which chooses a unique point in this parameter space is a simple hypothesis. But a hypothesis that selects a sub-region of the parameter space containing more than one point is composite.

The number $k-j$ is known as the degrees of freedom of the hypothesis, and $j$ is called the number of constraints imposed by the hypothesis.

Illustration. Let

$$f(x \mid \theta_1, \theta_2) = \frac{1}{\theta_2\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(\frac{x-\theta_1}{\theta_2}\right)^{2}\right\}$$

Then the hypothesis $H_0 : \theta_1 = 5,\ \theta_2 = 2$ is a simple hypothesis whereas the hypothesis $H_0 : \theta_1 = 5,\ \theta_2 < 2$ is composite.

Definition. A test of a statistical hypothesis is a procedure for deciding whether to accept or to reject the hypothesis.

17.3. Definitions

Critical regions. In order to test any hypothesis on the basis of a random sample of observations, we shall divide the sample space $\Omega$ (i.e. all possible sets of observations) into regions $\omega$ and $\Omega\setminus\omega$. If the observed sample point $x$ falls into $\omega$, we reject the hypothesis; and if $x$ falls into the complementary region $\Omega\setminus\omega$, we accept the hypothesis. We call $\omega$ the critical region of the test, and $\Omega\setminus\omega$ the acceptance region. That is to say, the critical region of a test of a statistical hypothesis is the part $\omega$ of the space $\Omega$ which corresponds to the rejection of the hypothesis being tested. In fact, $\Omega = \omega \cup (\Omega\setminus\omega)$.

Definition. Null hypothesis $H_0$ is the hypothesis of no differences. It is usually formulated for the express purpose of being rejected. If it is rejected, the alternative hypothesis $H_1$ may be accepted.

Alternative hypothesis $H_1$ is the operational statement of the experimenter's research hypothesis.

Research hypothesis is the prediction derived from the theory under test.

Illustration. To test the research hypothesis, we state it in the operational form as the alternative hypothesis $H_1$. Then $H_1$ would be that $\mu_1 \ne \mu_2$, i.e. that the mean amount of time spent in reading newspapers by the members of two populations is unequal. But $H_0$ would be that $\mu_1 = \mu_2$, that is, that the mean amount of time spent in reading newspapers by the members of the two populations is the same. If the data allow us to reject $H_0$, then $H_1$ can be accepted, and this would support the research hypothesis and its underlying theory.

Nature of research hypothesis. This determines how $H_1$ should be stated. If the research hypothesis simply states that two groups will differ with respect to means, then $H_1$ is that $\mu_1 \ne \mu_2$. But if the theory predicts the direction of the difference, i.e. that one specified group will have a larger mean than the other, then $H_1$ may be either that $\mu_1 > \mu_2$ or that $\mu_1 < \mu_2$.

Level of Significance. Definition: The significance level $\alpha$ is the probability that a statistical test will yield a value under which the null hypothesis will be rejected when in fact it is true; that is, the significance level indicates the probability of committing the type I error. This implies that we can find $\omega$ such that, given $H_0$, the probability of rejecting $H_0$ (i.e. the probability that $x$ falls in $\omega$) is equal to a pre-assigned $\alpha$.




In notation,

$$\alpha = P\{\text{committing type I error}\} = P\{\text{reject } H_0 \mid H_0 \text{ is true}\} = P\{x \in \omega \mid H_0\}$$   ...(1)

The value $\alpha$ is also called the size of the test.

Caution. In dealing with a discontinuous distribution it is not possible to satisfy (1) for each $\alpha$ in the interval (0, 1).

Definition. $\beta$ is the probability that a statistical test will yield a value under which the null hypothesis will be accepted when in fact it is false; that is, $\beta$ gives the probability of committing the type II error.

In notation,

$$\beta = P\{\text{committing type II error}\} = P\{\text{accept } H_0 \mid H_0 \text{ is false or } H_1 \text{ is true}\} = P\{x \in \Omega\setminus\omega \mid H_1\}$$

We assert that the errors made in testing a statistical hypo¬ 
thesis are of two types : 

(a) We may wrongly reject it, when it is true; 

(b) We may wrongly accept it, when it is false. 

These are known as Type I and Type II errors respectively. 

Power of the test. Definition: $\beta$, the probability of a Type II error, is in fact a function of the alternative hypothesis, say $H_1$, i.e.

$$P\{x \in \Omega\setminus\omega \mid H_1\} = \beta \quad\text{or}\quad P\{x \in \omega \mid H_1\} = 1-\beta$$   ...(2)

This complementary probability, $1-\beta$, is said to be the power of the test of the hypothesis $H_0$ against the alternative hypothesis $H_1$.

Remark. “Power is a function of $H_1$” means that the specification of $H_1$ is essential.

$\beta$, the probability of committing a Type II error, decreases as the sample size $n$ increases, and therefore the power increases with the size $n$.

Definition. A critical region whose power is no smaller than that of any other region of the same size for testing a hypothesis $H_0$ against the alternative hypothesis $H_1$ is said to be a best critical region (abbreviated BCR), and a test based on a BCR is called a most powerful (abbreviated MP) test.

Unbiased Test. Definition. Assume that there exists a size $\alpha$ critical region $\omega$ for $H_0 : \theta = \theta_0$ against simple $H_1 : \theta = \theta_1$ such that its power satisfies

$$P\{x \in \omega \mid \theta_1\} \ge \alpha$$

or

$$P\{x \in \omega \mid \theta_0\} \le \alpha$$

for a given level of significance $\alpha$, $0 < \alpha < 1$. Then the test of $H_0$ against $H_1$ is called an unbiased test.

17.4. Testing a simple $H_0$ against a simple $H_1$

Neyman-Pearson Lemma. Assume that there exists a critical region $\omega$ of size $\alpha$ and a constant $\lambda_\alpha$ such that

$$\frac{L(x \mid H_0)}{L(x \mid H_1)} \le \lambda_\alpha \ \text{inside } \omega, \qquad \frac{L(x \mid H_0)}{L(x \mid H_1)} \ge \lambda_\alpha \ \text{outside } \omega,$$

where $L(x \mid H_i)$ denotes the likelihood function given the hypothesis $H_i$ ($i = 0, 1$). Then $\omega$ is a best critical region (or BCR) of size $\alpha$.

Proof. To prove this contention, assume that $\omega$ is a BCR of size $\alpha$. Then we have

$$1-\beta = \int_{\omega} L(x \mid H_1)\,dx$$   ...(1)

We now wish to maximize (1) for choice of $\omega$, subject to the condition that $P\{x \in \omega \mid H_0\} = \alpha$, i.e.

$$\int_{\omega} L(x \mid H_0)\,dx = \alpha$$   ...(2)

Note that $\int dx = \int \cdots \int dx_1 \ldots dx_n$.

(1) can be rewritten as

$$1-\beta = \int_{\omega} \frac{L(x \mid H_1)}{L(x \mid H_0)}\,L(x \mid H_0)\,dx$$   ...(3)

which is to be a maximum. This implies that we have to select $\omega$ to maximize the expectation of $L(x \mid H_1)/L(x \mid H_0)$ in $\omega$, because $\omega$ is assumed to be a BCR. This is evidently done iff $\omega$ consists of that fraction $\alpha$ of the sample space $\Omega$ which contains the largest values of $L(x \mid H_1)/L(x \mid H_0)$. This asserts that the BCR will consist of the points in $\Omega$ satisfying the inequality:

$$\frac{L(x \mid H_0)}{L(x \mid H_1)} \le \lambda_\alpha$$   ...(4)

where $\lambda_\alpha$ is so chosen that the size condition (2) holds. This is true for any $\alpha$ provided that the joint distribution (i.e. LF) of the observations is continuous. In this event the sample points in $\Omega$ satisfying the equality:

$$\frac{L(x \mid H_0)}{L(x \mid H_1)} = \lambda_\alpha$$   ...(5)

will form a set of measure zero.

The likelihood functions are in fact non-negative. This means that $\lambda_\alpha$ (a bound on their ratio) is necessarily non-negative. That is to say, given $H_0$, the value of $\lambda_\alpha$ depends on the size $\alpha$ of the critical region $\omega$ and on the alternative hypothesis $H_1$, so that $\lambda_\alpha$ is a function of both $\alpha$ and $H_1$. This finishes the proof of the proposition.

Remark. Let the distribution not be continuous. Then it can be rendered into a continuous one by a randomization device. In this case (5) holds with some non-zero probability $p$, and we can only select $\lambda_\alpha$ in (4) to make the size of the test equal to

$$\alpha - q \quad (0 < q < p).$$

If we wish to convert the test into one of exact size $\alpha$, we use a random device so that, whenever (5) holds, we reject $H_0$ with probability $q/p$ and accept $H_0$ with probability $1 - q/p$. The probability of rejection is then $(\alpha - q) + p \cdot q/p = \alpha$, as desired. Note that in this event the BCR is not unique because it is subject to random fluctuations.
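
The randomization device can be made concrete for a binomial observation. The sketch below is an illustration, not part of the text: it assumes a test of $H_0 : p = p_0$ against a larger alternative, for which the critical region is taken as an upper tail, and the function names are hypothetical. In the notation of the remark, `tail` plays the part of $\alpha - q$, `boundary` the part of $p$, and the returned `gamma` is $q/p$.

```python
from math import comb


def binom_pmf(y, n, p):
    """P{Y = y} for Y binomial (n, p)."""
    return comb(n, y) * p ** y * (1 - p) ** (n - y)


def randomized_upper_tail_test(n, p0, alpha):
    """Return (c, gamma): reject H0 if Y > c, and with probability gamma if Y == c."""
    tail = 0.0                                   # P{Y > c | p0}, starting from c = n
    for c in range(n, -1, -1):
        boundary = binom_pmf(c, n, p0)           # P{Y = c | p0}
        if tail + boundary > alpha:
            # rejecting outright at Y == c would overshoot alpha, so randomize there
            return c, (alpha - tail) / boundary
        tail += boundary
    return -1, 0.0                               # alpha >= 1: always reject
```

With $n = 10$, $p_0 = 0.5$ and $\alpha = 0.05$ the region $Y > 8$ has size $11/1024$, and rejecting with probability $40.2/45$ when $Y = 8$ brings the size to exactly 0.05.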
Exercises. 

Ex. 1. Test the hypothesis $H_0 : \mu = \mu_0$ against $H_1 : \mu = \mu_1$ in the distribution

$$f(x \mid \mu) = \frac{1}{\sqrt{2\pi}}\exp\{-\tfrac{1}{2}(x-\mu)^2\}, \quad -\infty < x < \infty.$$

Evidently,

$$L(x \mid H_i) = (2\pi)^{-n/2}\exp\Big\{-\tfrac{1}{2}\sum_{j=1}^{n}(x_j - \mu_i)^2\Big\}, \quad i = 0, 1,$$

$$= (2\pi)^{-n/2}\exp\left[-\tfrac{n}{2}\{s^2 + (\bar x - \mu_i)^2\}\right]$$   ...(1)

where $\bar x$, $s^2$ are the sample mean and variance respectively. Then for the BCR, we find from (4)

$$\frac{L(x \mid H_0)}{L(x \mid H_1)} = \exp\left[\tfrac{n}{2}\{(\mu_0 - \mu_1)\,2\bar x + (\mu_1^2 - \mu_0^2)\}\right] \le \lambda_\alpha$$   ...(2)

which yields:

$$(\mu_0 - \mu_1)\,\bar x \le \frac{2\log\lambda_\alpha + n(\mu_0^2 - \mu_1^2)}{2n}$$   ...(3)

This shows that, given $H_0$, $H_1$ and $\alpha$, the BCR is obtained from the value of the sample mean $\bar x$ alone. (Note that $\bar x$ is a MVB sufficient statistic for $\mu$.) $\mu_0 > \mu_1$ implies that the BCR, in view of (3), is then

$$\bar x \le \tfrac{1}{2}(\mu_0 + \mu_1) + \frac{\log\lambda_\alpha}{n(\mu_0 - \mu_1)}$$   ...(4)

whereas $\mu_1 > \mu_0$ provides the BCR:

$$\bar x \ge \tfrac{1}{2}(\mu_0 + \mu_1) - \frac{\log\lambda_\alpha}{n(\mu_1 - \mu_0)}$$   ...(5)

which is again reasonable on an intuitive basis. Now we infer that in testing the hypothetical value $\mu_0$ against a smaller value $\mu_1$, we reject $\mu_0$ if the sample mean $\bar x$ falls below a certain value depending on $\alpha$, the size of the test, and in testing $\mu_0$ against a larger value $\mu_1$, we reject $\mu_0$ if the sample mean $\bar x$ exceeds a certain value.

Remark. The best critical region here is a tail of the distribution of $\bar x$ (the left tail when $\mu_1 < \mu_0$, the right tail when $\mu_1 > \mu_0$); that is, the BCR is determined by a single statistic, which is called a test statistic. We note that whatever the value of $\mu$, $\bar x$ is itself exactly normally distributed with mean $\mu$ and variance $1/n$. Then to determine a test of size $\alpha$ for testing $\mu_0$ against $\mu_1 > \mu_0$, we have to compute $\bar x_\alpha$ such that

$$\int_{\bar x_\alpha}^{\infty} \left(\frac{n}{2\pi}\right)^{1/2}\exp\{-\tfrac{n}{2}(\bar x - \mu_0)^2\}\,d\bar x = \alpha$$   ...(6)

Define $G(x)$ as follows:

$$G(x) = \int_{-\infty}^{x} (2\pi)^{-1/2}\exp(-\tfrac{1}{2}y^2)\,dy$$   ...(7)

Then we have, for $\mu_1 > \mu_0$,

$$\bar x_\alpha = \mu_0 + d_\alpha/n^{1/2}$$   ...(8)

where

$$G(-d_\alpha) = \alpha$$   ...(9)

The power of the test is then

$$\int_{\bar x_\alpha}^{\infty} \left(\frac{n}{2\pi}\right)^{1/2}\exp\{-\tfrac{n}{2}(\bar x - \mu_1)^2\}\,d\bar x = 1-\beta$$   ...(10)

In view of (8), the integral may be standardized so that

$$1-\beta = 1 - G\{n^{1/2}(\mu_0 - \mu_1) + d_\alpha\} = G\{n^{1/2}(\mu_1 - \mu_0) - d_\alpha\},$$   ...(11)

because $G(x) = 1 - G(-x)$ by symmetry. (11) indicates that the power is a monotone increasing function both of $n$, the sample size, and of $\mu_1 - \mu_0$, the difference between the hypothetical values between which the test has to select.
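
The power formula (11) can be evaluated directly with the standard normal distribution function. The sketch below is illustrative; `power_unit_variance` is a hypothetical name, and $G$ is taken from Python's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

G = NormalDist().cdf          # the function G(x) of (7)


def power_unit_variance(n, mu0, mu1, alpha):
    """Power G{sqrt(n) * (mu1 - mu0) - d_alpha} of the size-alpha test, mu1 > mu0."""
    d_alpha = -NormalDist().inv_cdf(alpha)       # defined by G(-d_alpha) = alpha
    return G(sqrt(n) * (mu1 - mu0) - d_alpha)
```

Setting $\mu_1 = \mu_0$ returns the size $\alpha$ itself, and increasing either $n$ or $\mu_1 - \mu_0$ raises the power, as (11) asserts.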

Ex. 2. Test the hypothesis $H_0 : \theta = \theta_0$ against the alternative hypothesis $H_1 : \theta = \theta_1 < \theta_0$ in the distribution

$$f(x \mid \theta) = \theta\exp(-\theta x), \quad x \ge 0$$

Evidently, as $j$ runs from 1 to $n$,

$$L(x \mid H_0) = \prod f(x_j \mid \theta_0) = \theta_0^{\,n}\exp(-\theta_0 \Sigma x_j)$$

and

$$L(x \mid H_1) = \theta_1^{\,n}\exp(-\theta_1 \Sigma x_j)$$

so that for the BCR, we have

$$\frac{L(x \mid H_0)}{L(x \mid H_1)} = \frac{\theta_0^{\,n}\exp(-\theta_0 \Sigma x_j)}{\theta_1^{\,n}\exp(-\theta_1 \Sigma x_j)} = \left(\frac{\theta_0}{\theta_1}\right)^{n}\exp\{(\theta_1 - \theta_0)\Sigma x_j\} \le \lambda_\alpha$$

This inequality may be written as

$$\exp\{(\theta_1 - \theta_0)\Sigma x_j\} \le \lambda_\alpha\,(\theta_1/\theta_0)^{n}$$

which, on taking logarithms on both sides, assumes the form

$$(\theta_1 - \theta_0)\Sigma x_j \le \log\{\lambda_\alpha\,(\theta_1/\theta_0)^{n}\}$$

In view of our hypothesis, $H_1$ specifies that $\theta_1 < \theta_0$. Then division of both sides of the preceding inequality by $\theta_1 - \theta_0$ leads to the reverse inequality

$$\Sigma x_j \ge \frac{\log\{\lambda_\alpha\,(\theta_1/\theta_0)^{n}\}}{\theta_1 - \theta_0}$$   ...(1)

Set $n = 1$, $\theta_0 = 2$, $\theta_1 = 1$ in (1). Then the BCR, by dint of (1), is

$$x \ge \frac{\log\{\lambda_\alpha\,(\theta_1/\theta_0)\}}{\theta_1 - \theta_0} = \log\frac{2}{\lambda_\alpha} = x_\alpha,$$

where $\lambda_\alpha$ is chosen to make the probability 0.135 that $x$ will exceed $x_\alpha$. Therefore, the right tail is better than the left tail for the problem under consideration, and is the BCR.
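
For a single observation the critical value and the power of this right-tail region have closed forms, since $P\{x \ge c \mid \theta\} = e^{-\theta c}$. The sketch below is illustrative, with a hypothetical function name.

```python
from math import exp, log


def exponential_bcr(theta0, theta1, alpha):
    """Single observation, theta1 < theta0: reject H0 when x >= x_alpha."""
    x_alpha = -log(alpha) / theta0        # P{x >= x_alpha | theta0} = alpha
    power = exp(-theta1 * x_alpha)        # P{x >= x_alpha | theta1}
    return x_alpha, power
```

With $\theta_0 = 2$, $\theta_1 = 1$ and size $e^{-2} \approx 0.135$, the region is $x \ge 1$ and its power is $e^{-1} \approx 0.368$.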

Ex. 3. Test the hypothesis $H_0 : \theta = 0$ against $H_1 : \theta = 1$ in the Cauchy distribution

$$dF(x) = \frac{dx}{\pi\{1 + (x-\theta)^2\}}, \quad -\infty < x < \infty$$

For our convenience we restrict ourselves to the case $n = 1$. Recall that the BCR is given by

$$\frac{L(x \mid H_0)}{L(x \mid H_1)} \le \lambda_\alpha$$

Then the BCR is given by

$$\frac{L(x \mid H_0)}{L(x \mid H_1)} = \frac{1 + (x-1)^2}{1 + x^2} \le \lambda_\alpha$$

which yields:

$$x^2(\lambda_\alpha - 1) + 2x + (\lambda_\alpha - 2) \ge 0$$   ...(1)

We observe that the form of the BCR thus defined depends upon the value of $\alpha$ chosen.

Letting $\lambda_\alpha = 1$ in (1) provides $x \ge 1/2$, so that we should reject $H_0 : \theta = 0$ in favour of $H_1 : \theta = 1$ whenever the observed $x$ is closer to 1 than to 0. On the other hand, setting $\lambda_\alpha = 0.5$ leads to

$$x^2 - 4x + 3 \le 0 \quad\text{or}\quad (x-2)^2 \le 1,$$

which gives $1 \le x \le 3$. This is the critical region.

As a matter of fact, the Cauchy distribution is Student's distribution with 1 degree of freedom and hence

$$F(x) = \tfrac{1}{2} + (1/\pi)\tan^{-1} x.$$

Then we may compute the size of each of the two sets.

For $\lambda_\alpha = 1$, the size is

$$P\{x \ge 1/2\} = 0.352$$

whereas for $\lambda_\alpha = 0.5$, the size is

$$P\{1 \le x \le 3\} = 0.148$$
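
The two sizes quoted above, and the corresponding powers under $\theta = 1$, follow from the Cauchy distribution function. A minimal sketch, with illustrative names:

```python
from math import atan, pi


def cauchy_cdf(x, theta=0.0):
    """F(x) = 1/2 + (1/pi) * arctan(x - theta)."""
    return 0.5 + atan(x - theta) / pi


# Region x >= 1/2 (lambda = 1)
size_half_line = 1 - cauchy_cdf(0.5)                 # under H0: theta = 0
power_half_line = 1 - cauchy_cdf(0.5, theta=1.0)     # under H1: theta = 1

# Region 1 <= x <= 3 (lambda = 0.5)
size_interval = cauchy_cdf(3) - cauchy_cdf(1)
power_interval = cauchy_cdf(3, theta=1.0) - cauchy_cdf(1, theta=1.0)
```

The sizes come out as about 0.352 and 0.148, matching the values in the example.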

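Ex. 4. Test the hypothesis $H_0 : \mu = \mu_0$ against $H_1 : \mu = \mu_1$ in the distribution

$$f(x \mid \mu) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}$$

Hence find the power of the test.

[D.U. 1963; Bombay 66, Sardar Patel 68]

Now

$$L(x \mid H_i) = \{\sigma\sqrt{2\pi}\}^{-n}\exp\left\{-\frac{1}{2\sigma^2}\sum (x_j - \mu_i)^2\right\} = \{\sigma\sqrt{2\pi}\}^{-n}\exp\left[-\frac{n}{2\sigma^2}\{s^2 + (\bar x - \mu_i)^2\}\right]$$

where $\bar x$ and $s^2$ are the sample mean and variance.

The BCR is given by

$$\frac{L(x \mid H_0)}{L(x \mid H_1)} = \exp\left[-\frac{n}{2\sigma^2}\{(\bar x - \mu_0)^2 - (\bar x - \mu_1)^2\}\right] = \exp\left[-\frac{n}{2\sigma^2}\{2\bar x(\mu_1 - \mu_0) + (\mu_0^2 - \mu_1^2)\}\right] \le \lambda_\alpha$$

Then

$$\bar x(\mu_1 - \mu_0) \ge \frac{\mu_1^2 - \mu_0^2}{2} - \frac{\sigma^2}{n}\log\lambda_\alpha$$

We conclude that the BCR is determined by $\bar x$ alone. If $\mu_1 > \mu_0$, the BCR is given by

$$\bar x \ge \frac{\mu_0 + \mu_1}{2} + \frac{\sigma^2}{n}\,\frac{\log\lambda_\alpha}{\mu_0 - \mu_1} = d_1, \text{ say}$$   ...(1)

If $\mu_1 < \mu_0$, the BCR is given by

$$\bar x \le \frac{\mu_1 + \mu_0}{2} + \frac{\sigma^2}{n}\,\frac{\log\lambda_\alpha}{\mu_0 - \mu_1} = d_2, \text{ say}$$   ...(2)

We have to choose the constants $d_1$ and $d_2$ to make the probability of each of the relations (1) and (2) equal to $\alpha$ when the hypothesis $H_0$ is true. The sampling distribution of $\bar x$ when $H_i$ is true is $N(\mu_i, \sigma^2/n)$. Then we find

$$\int_{d_1}^{\infty} \frac{\sqrt{n}}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{n(\bar x - \mu_0)^2}{2\sigma^2}\right\}d\bar x = \alpha$$   ...(3)

and

$$\int_{-\infty}^{d_2} \frac{\sqrt{n}}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{n(\bar x - \mu_0)^2}{2\sigma^2}\right\}d\bar x = \alpha$$   ...(4)

Applying the transformation

$$y = \frac{\sqrt{n}\,(\bar x - \mu_0)}{\sigma}$$

to relations (3) and (4), we have

$$\frac{1}{\sqrt{2\pi}}\int_{\sqrt{n}(d_1 - \mu_0)/\sigma}^{\infty} \exp(-\tfrac{1}{2}y^2)\,dy = \alpha$$   ...(5)

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\sqrt{n}(d_2 - \mu_0)/\sigma} \exp(-\tfrac{1}{2}y^2)\,dy = \alpha$$   ...(6)

But there exists a normal deviate, again denoted by $\lambda_\alpha$, such that

$$\frac{1}{\sqrt{2\pi}}\int_{\lambda_\alpha}^{\infty} \exp(-\tfrac{1}{2}y^2)\,dy = \alpha$$

whose numerical value can be computed from the tables of areas of the normal distribution for any $\alpha$. From (5) and (6), we have

$$d_1 = \mu_0 + \lambda_\alpha\frac{\sigma}{\sqrt{n}}$$   ...(7)

$$d_2 = \mu_0 - \lambda_\alpha\frac{\sigma}{\sqrt{n}}$$   ...(8)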

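Power of the test

By definition,

$$\text{Power of the test} = 1-\beta = P\{x \in \omega \mid H_1\} = P\{\bar x \ge d_1 \mid H_1\}$$

$$= \int_{d_1}^{\infty} \frac{\sqrt{n}}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{n(\bar x - \mu_1)^2}{2\sigma^2}\right\}d\bar x$$

$$= \frac{1}{\sqrt{2\pi}}\int_{\sqrt{n}(d_1 - \mu_1)/\sigma}^{\infty} \exp(-\tfrac{1}{2}z^2)\,dz$$

$$= \frac{1}{\sqrt{2\pi}}\int_{\lambda_\alpha - (\mu_1 - \mu_0)\sqrt{n}/\sigma}^{\infty} \exp(-\tfrac{1}{2}z^2)\,dz,$$

using (7), i.e. $d_1 = \mu_0 + \lambda_\alpha\sigma/\sqrt{n}$.

Letting $(\mu_1 - \mu_0)\sqrt{n}/\sigma = \Delta$ provides:

$$1-\beta = \frac{1}{\sqrt{2\pi}}\int_{\lambda_\alpha - \Delta}^{\infty} \exp(-\tfrac{1}{2}z^2)\,dz = 1 - \int_{-\infty}^{\lambda_\alpha - \Delta} \phi(z)\,dz = 1 - F(\lambda_\alpha - \Delta),$$

where $\phi$ and $F$ denote the density and distribution function of the standardized normal variate.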

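Particular Case. Let $x$ be normally distributed with $\sigma = 10$, and we wish to test $H_0 : \mu = 100$ against $H_1 : \mu = 110$. Then how large a sample should be taken if the probability of accepting $H_0$ when $H_1$ is true is 0.02, where a critical region of size 0.05 is used? (Agra B.Sc. 70)

Now $H_0 : \mu = 100$ against $H_1 : \mu = 110$ $\Rightarrow$ $\mu_0 = 100$, $\mu_1 = 110$, say $\Rightarrow$ $\mu_0 < \mu_1$.

Using (1), the BCR is given by

$$\bar x \ge d_1$$

Then

$$\alpha = 0.05 = \int_{(d_1 - \mu_0)\sqrt{n}/\sigma}^{\infty} \frac{1}{\sqrt{2\pi}}\exp(-\tfrac{1}{2}z^2)\,dz$$

giving

$$\frac{d_1 - \mu_0}{\sigma/\sqrt{n}} = 1.645$$   ...(9)

Also

$$\beta = 0.02 = \int_{-\infty}^{(d_1 - \mu_1)\sqrt{n}/\sigma} \frac{1}{\sqrt{2\pi}}\exp(-\tfrac{1}{2}z^2)\,dz$$

giving:

$$\frac{\mu_1 - d_1}{\sigma/\sqrt{n}} = 2.055$$   ...(10)

Replacing $\mu_0 = 100$, $\mu_1 = 110$, $\sigma = 10$ in (9) and (10) leads to

$$d_1 = 100 + \frac{10}{\sqrt{n}}\,(1.645) = 110 - \frac{10}{\sqrt{n}}\,(2.055)$$

from which we find

$$3.70/\sqrt{n} = 1 \;\Rightarrow\; n = (3.7)^2 = 13.69 \simeq 14,$$

which is the desired sample size.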
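
The same sample-size calculation can be carried out without tables. The sketch below is illustrative; the function name is hypothetical and the normal deviates are taken from `statistics.NormalDist`, so they differ from the rounded table values in the third decimal place.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(mu0, mu1, sigma, alpha, beta):
    """Smallest n giving size alpha and type II error beta (one-sided test, mu1 > mu0)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha)          # about 1.645 for alpha = 0.05
    z_beta = z(1 - beta)            # about 2.054 for beta = 0.02
    n_exact = ((z_alpha + z_beta) * sigma / (mu1 - mu0)) ** 2
    return n_exact, ceil(n_exact)
```

For $\mu_0 = 100$, $\mu_1 = 110$, $\sigma = 10$, $\alpha = 0.05$ and $\beta = 0.02$ this gives $n \approx 13.7$, which rounds up to 14.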

Ex. 5. Test the hypothesis //„ : n=a 0 in N (0, a 2 ) against H x ’ 

O — (T rt . 

The BCR is given by 

L x exp f l (1- 1) 2 X* 1 < Aa 

L{y | //,) 'a 0 / I Va , 2 <to*/ J 

/ t \ 


where j runs from j= 1 to x 


...(1) 


i a "\ °.' a 2 *<• < « log f —) + l->g A, 

- Ofts 0* V«1 / 


1 





2 d'A 


which provides : 

2 XjZ ^ 2<J o z<J i a i n log (oofaj + iog A a ) 

netting a 0 > aj yields the BCR 

V y,2 .c- ?<To 2 ^1 2 , . 

J ^ l°g (<W<*i)+log A»} = </ lt say 

a nd setting Oo < a* provides the BCR 

y v o \ 2 <t 0 2 o 1 2 

^ — ^2 _ Gi 2 i n (^ 0 /^ 1 ) + log A,} =<4 say 

frn 1 th u distr ! bution of 27 -Vy* is a (X-\ a 0 ») with n d f when H 0 is 
e. Hence if a is the size ol the critical region, then d 1 and d., 
are determined from 

1 Cd. 

J o (X 2 ) exp. {- I/2a„2 X 2 } dX 2 = * .. (3) 


•••(I) 


...( 2 ) 


%no 0 


and 


r(n/2) 

1 


n 


2 in °° r { n/2) 
which simplifies to 
— 1 f4/°0 2 

2 n/Z f(n/2) 


fd 2 (* 2 ) <n ' 2, ~ 1 exp {—i/2a 0 2 dX* = a ...(4) 


and 


J o l/<T ° 2 exp (-*x 2 ) i/x*-« 

--fJ/„ W '”- 1 exp (—*X») <//.*«« 

Hum J “ a/a ° 


2 n/2 f(n/2) 


• • (5) 


...( 6 ) 


The values d l /o 0 2 =\ l and d t /a a 2 = \ t can be determined by using 
the table of values of incomplete gama function. The power func¬ 
tion of the above test when //, is a=c, < (r n is 


1 

[ j| (/. J ) (n/2) 'exp {— 

2 ln °‘" r ( n/2) 

Jo 

1 

2" /iS A«/2) . 

^v (X 2 ) w 2) -' exp 

a 2 

_ 1 

^( X 1 )(«/ 2 )- , exp ( 

2 "" f(n/2) . 


0) 


Clearly, the power function when //, is rr=<n > a 0 is 

_!_ I" (X ,,(«/2)-l 

2"'* An/2) J A oo* ' 


exp {-iX 2 } JX 2 


...(S) 


M 

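Ex. 6. Assume that a random sample of size $n$ is drawn from a Poisson population with probability function

$$f(x \mid \lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}.$$

Then the most powerful critical region of size not exceeding $\alpha$ for testing the hypothesis $H_0 : \lambda = \lambda_0$ against $H_1 : \lambda = \lambda_1$ is of the form

$$\bar{x} \le a_\alpha \ \text{ if } \lambda_0 > \lambda_1, \qquad \bar{x} \ge b_\alpha \ \text{ if } \lambda_0 < \lambda_1,$$

where $\bar{x}$ is the sample mean and $a_\alpha$ and $b_\alpha$ are constants.

Our hypothesis provides:

$$f(x \mid \lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}.$$

Then

$$L(x \mid H_i) = \frac{\exp(-n\lambda_i)\,\lambda_i^{\sum x_j}}{\prod x_j\,!}, \qquad i = 0, 1;\ j = 1, 2, \ldots, n.$$

The B.C.R. is given by

$$\frac{L(x \mid H_0)}{L(x \mid H_1)} = \exp\{-n(\lambda_0 - \lambda_1)\}\left(\frac{\lambda_0}{\lambda_1}\right)^{\sum x_j} < A_\alpha,$$

i.e.

$$\Big(\sum x_j\Big)\log(\lambda_0/\lambda_1) < \log\left[A_\alpha \exp\{n(\lambda_0 - \lambda_1)\}\right].$$

Then

$$n\bar{x}\,\log(\lambda_0/\lambda_1) < \log A_\alpha + n(\lambda_0 - \lambda_1).$$

Choice I: $\lambda_0 > \lambda_1$.

This implies that $\lambda_0/\lambda_1 > 1 \Rightarrow \log(\lambda_0/\lambda_1) > 0$. Then the B.C.R. is given by

$$\bar{x} < \frac{\log A_\alpha + n(\lambda_0 - \lambda_1)}{n(\log\lambda_0 - \log\lambda_1)} = a_\alpha, \text{ say},$$

i.e. $\lambda_0 > \lambda_1 \Rightarrow \bar{x} < a_\alpha$.

Choice II: $\lambda_0 < \lambda_1$.

This implies that $\lambda_0/\lambda_1 < 1 \Rightarrow \log(\lambda_0/\lambda_1) < 0$, so dividing by $n\log(\lambda_0/\lambda_1)$ reverses the inequality. Then the B.C.R. is given by

$$\bar{x} > \frac{-\log A_\alpha + n(\lambda_1 - \lambda_0)}{n(\log\lambda_1 - \log\lambda_0)} = b_\alpha, \text{ say},$$

i.e. $\lambda_0 < \lambda_1 \Rightarrow \bar{x} > b_\alpha$.

This is what we wished to show.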

Ex. 7. For the normal distribution with zero mean and variance 
a-, the best critical region forH 0 :o=a 0 against the alternative 
Hi : a—G\ is of the form 

27 .r S < a a for a 0 > oi 

and 27 *< a ^ b a far o 0 < where i=- 1, 2, n 

The power of the best critical region when o 0 > oi is 

F ("if, ) 

where is lower I00 a percent point and F is the distrilution 


a» 


say 






f(x ' ° )= ^7M «p (-£,) 


n 


Then L ( x | //,)= n /( } , } 

/= 1 


Hence — ** 1 

L V* | /y,; 


( c ,V (2n)) exp | 2c e * ^ X f | 

/== 0, I, y= 1,2,..., 


Then , log 

• a 0 2 — (Xj 2 

'■*' ^ JC/! ^ lo 8 ^«-n log 

Choice I „„ > ai- 


^ Xj‘ < 


This implies lhat the B.C.R. j s given by 

[ log A a -„ log sav 

. L ii 0 -~ar say 

Choice II : „„ < CT[ . 

This implies that the B.C.R. is given by 

jt I > [I ° E to « Woo)] 2 i a ^ r b, x , sav. 
given N b 7 ^ "* * implies ,,,a, ,he Si “ ° f " he critical region is 

from which, ~e obtain' 1 11 

^ Y ; r isa x, - v ^ "ii w,~ r, T rfy * and 
ow K 7w X '“r lower l00t,% poim - inen .” 

In view of definition of power’7,he test 

1 -p = P[x tu> i //,} 

= P \ S y, X ‘\ < ° a 1 ^AJ.whern j= I, 

= W —Oalr, [ 




p&j- < ■ 1 Hl 1 


e=P {Z Xj* ^ °0 2 ^- 2 «» » I 

< S - 1 l 

- ''{*■■ • % v - -1 

. * u * x, ~ is a X 2 —variate with n d f. 

noting that under H i9 ,s a * 

Therefore power of the test • x? *’ B ) 

This proves the result. 

Ex. 8. ^4 distribution is defined as follows : 

Then examine whether aB.C.R. exists for ****J h ?£j 
u • 0=0 oeainst the alternative hypothesis . »> o 

/o? (A« parameter 9 «/ r/ie distribution defined as above. 

Now n /^ 1 0) 

=(i+»r "[57^’ where j=l ’ 2, " n 

The B. C. R is given by ^ 

(‘ +W n (l+e °'” n (*7tV ’ 

from which, we find 

n log (1+W-M (*l+*t)»°» A+« «* 2 ( ; + to ^ w+w 


or 


:21 0g^-S) >l0gA+,,l0g 

• ■ • rl f*i±£o\ 

Now the test criterion is Z log \^J^ l 


Now we see that it is impossible to put this test c ^ iier ‘°“J“ 
the form of a function of the sample observations not epen g 
on the hypothesis 1 herefore there exists no B. 
case. 

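Ex. 9. Define the distribution as follows:

$$dF = \beta \exp\{-\beta(x - \gamma)\}\,dx, \qquad x \ge \gamma.$$

Then for a hypothesis $H_0$ that $\beta = \beta_0$, $\gamma = \gamma_0$ and an alternative hypothesis $H_1$ that $\beta = \beta_1$, $\gamma = \gamma_1$, the B.C.R. is given by

$$\bar{x} \le \frac{1}{\beta_1 - \beta_0}\left\{\gamma_1\beta_1 - \gamma_0\beta_0 - \frac{1}{n}\log k + \log\frac{\beta_1}{\beta_0}\right\},$$

provided that the admissible hypothesis is restricted by the condition that $\gamma_1 \le \gamma_0$, $\beta_1 \ge \beta_0$.

By hypothesis,

$$f(x \mid \beta, \gamma) = \beta \exp\{-\beta(x - \gamma)\} \ \text{ for } x \ge \gamma, \qquad = 0 \text{ otherwise.}$$

Then

$$L\{x \mid \beta, \gamma\} = \prod_{j=1}^{n} f(x_j \mid \beta, \gamma) = \beta^{n} \exp\Big\{-\beta \sum_{j=1}^{n}(x_j - \gamma)\Big\} \ \text{ for } x_1, x_2, \ldots, x_n \ge \gamma, \qquad = 0 \text{ otherwise.}$$

Then a B.C.R. is given by

$$\frac{L\{x \mid H_1\}}{L\{x \mid H_0\}} = \frac{\beta_1^{n}\exp\{-\beta_1 \sum (x_j - \gamma_1)\}}{\beta_0^{n}\exp\{-\beta_0 \sum (x_j - \gamma_0)\}} > k$$

or

$$\exp\{-n\beta_1(\bar{x} - \gamma_1) + n\beta_0(\bar{x} - \gamma_0)\} > \left(\frac{\beta_0}{\beta_1}\right)^{n} k.$$

Then taking logarithms on both sides gives:

$$-n\beta_1(\bar{x} - \gamma_1) + n\beta_0(\bar{x} - \gamma_0) > n\log\frac{\beta_0}{\beta_1} + \log k$$

or

$$-n(\beta_1 - \beta_0)\,\bar{x} + n(\gamma_1\beta_1 - \gamma_0\beta_0) > n\log\frac{\beta_0}{\beta_1} + \log k,$$

from which we find

$$(\beta_1 - \beta_0)\,\bar{x} < \gamma_1\beta_1 - \gamma_0\beta_0 - \frac{1}{n}\log k + \log\frac{\beta_1}{\beta_0}.$$

Then

$$\bar{x} < \frac{1}{\beta_1 - \beta_0}\left\{\gamma_1\beta_1 - \gamma_0\beta_0 - \frac{1}{n}\log k + \log\frac{\beta_1}{\beta_0}\right\},$$

provided that $\beta_1 > \beta_0$. The result follows.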

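Theorem 1. The power of a BCR for testing a simple hypothesis against a simple alternative is never less than its size.

Proof. Assume that $\omega_0$ is a BCR of size $\alpha$ for testing a simple hypothesis $H_0$ against a simple alternative $H_1$. Then

$$\int_{\omega_0} L(x \mid H_0)\,dx = \alpha. \tag{1}$$

The Neyman-Pearson Lemma demands that

$$\frac{L(x \mid H_0)}{L(x \mid H_1)} \le A_\alpha \quad \text{inside } \omega_0,$$

from which we obtain

$$A_\alpha \int_{\omega_0} L(x \mid H_1)\,dx \ge \int_{\omega_0} L(x \mid H_0)\,dx,$$

which, in view of $\int_{\omega_0} L(x \mid H_1)\,dx = 1 - \beta$, implies that

$$A_\alpha (1 - \beta) \ge \alpha. \tag{2}$$

Again the Neyman-Pearson Lemma provides:

$$A_\alpha L(x \mid H_1) \le L(x \mid H_0) \quad \text{outside } \omega_0,$$

i.e. inside $\Omega - \omega_0$, where $\Omega$ is the sample space. Then we have

$$A_\alpha \int_{\Omega - \omega_0} L(x \mid H_1)\,dx \le \int_{\Omega - \omega_0} L(x \mid H_0)\,dx,$$

so that $A_\alpha \beta \le 1 - \alpha$, which implies

$$1 - \alpha \ge A_\alpha \beta. \tag{3}$$

(2) in conjunction with (3) leads to

$$A_\alpha (1 - \alpha)(1 - \beta) \ge A_\alpha\, \alpha\beta,$$

which simplifies to

$$1 - \beta - \alpha \ge 0.$$

This provides: $1 - \beta \ge \alpha$.

This completes the proof.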

Power function of the two sided F-test 

Define r a and d a such that 


...(2) 


...(3) 


0 


% (F) dF 


=«/2= ^ 


CO 


g (F) dF 


where g ( F) dF is the probability density function of a random 
variable F with n x and n 2 d. f. 

As a matter of fact, Xj 2 and X 2 2 are (Xj 2 , Oi 2 ) and (X 2 2 , o 2 ~) 
variables respectively, and we assume that they are independent 

with ni and n a d. f. Leto,=o,. Then follows Snede- 

cor’s F distribution with n x and n 2 d f. We accept the hypothesis 
<r 1 = Ga when it is infact wrong if r a <F<</ a * Evidently, 

£ = P {Ca^F^d* | o^ai} •••(*) 

where f} is the error of the second kind. On letting offiot, we 
have 

y 

X 2 * n k a x 


!!L^L = f. 2 . 2 i s Snedecor’s /"variable. Therefore, 


p r .j "2 _ Z? G 2 

— 5 * 






, <J 2 * 
d * ~s 




g ( F) dF. 


This implies the power function of the two-sided F-test is 


-*- i S 


d 

C/ a 


<*1 


2 


g (F) dF 


...( 2 ) 


Cz — 2 

Ol 


Exercises 


1. Explain the following terms: (i) errors of the first and second kinds, (ii) the best critical region, (iii) power function of a test, (iv) level of significance and (v) simple hypothesis.

(Bombay 68; Poona 60; Meerut 69) 

2. State and prove Neyman Pearson Lemma for testing a 
simple hypothesis against a simple alternative. 

(Kanpur 68; Mysore 67; Delhi 69; Bengalore 68; Benaras 68) 

3. Derive a most powerful test of the hypothesis 0 = ± against 
the alternative 6=\ for the parameter 6 in a geometrical distri¬ 
bution 0 (1—0) x , x=0 , 1, 2,..., based on a random sample size 2. 

(Mysore 67) 

4. Suppose $f(x, \theta) = (1 + \theta)\,x^{\theta}$, $0 < x \le 1$, $\theta > 0$, and $f(x, \theta) = 0$ elsewhere. The hypothesis $H_0 : \theta = 1$ is to be tested by taking a single observation on $x$ and using the interval $x < 0.5$ as the critical region. Compute (i) the size of the type I error, (ii) the probability of determining that $H_0$ is false if the true value of $\theta$ is 2, and find the power function.

[Hint. $\alpha = \left[\int_0^{0.5} (1 + \theta)\,x^{\theta}\,dx\right]_{\theta = 1} = 0.25$,

$P(\text{reject } H_0 \mid \theta = 2) = P(x < 0.5 \mid \theta = 2) = \left[\int_0^{0.5} (1 + \theta)\,x^{\theta}\,dx\right]_{\theta = 2} = 0.125$,

Power function $= P(\text{reject } H_0 \mid H_1) = P(x < 0.5 \mid H_1) = \int_0^{0.5} (1 + \theta)\,x^{\theta}\,dx = (0.5)^{1 + \theta}$.]


5. A random variable $X$ follows the exponential distribution $f(x, \theta) = (1/\theta)\exp(-x/\theta)$, $0 \le x < \infty$, $\theta > 0$. $H_0 : \theta = 6$ is rejected and $H_1 : \theta = 12$ is accepted if an observation selected at random takes the value 18 or more. Compute the critical region and the sizes of the two types of errors.

[Hint. $\Omega$ is $0 \le x < \infty$, the critical region $\omega$ is $x \ge 18$ and $\Omega - \omega$ is $x < 18$.

$\alpha = P\{x \in \omega \mid H_0\} = P\{x \ge 18 \mid \theta = 6\} = \left[\int_{18}^{\infty} \frac{1}{\theta}\exp(-x/\theta)\,dx\right]_{\theta = 6} = e^{-3}$,

$\beta = P\{x < 18 \mid \theta = 12\} = \left[\int_{0}^{18} \frac{1}{\theta}\exp(-x/\theta)\,dx\right]_{\theta = 12} = 1 - e^{-3/2}$.]

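6. Consider the binomial distribution $f(x, p) = \binom{3}{x} p^{x} q^{3 - x}$, $x = 0, 1, 2, 3$, where $q = 1 - p$, and $f(x, p) = 0$ elsewhere. We wish to test $H_0 : p = p_0 = \tfrac{1}{4}$ against $H_1 : p = p_1 = \tfrac{1}{2}$ by agreeing to accept $H_0$ if $x \le 1$ in three trials and to accept $H_1$ otherwise. Show that the probabilities of committing (a) the type I error and (b) the type II error are respectively $10/4^{3}$ and 0.5.

[Hint. $\alpha = P\{\text{reject } H_0 \mid H_0 \text{ is true}\} = P\{x > 1 \mid p = \tfrac{1}{4}\} = \sum_{x = 2}^{3} \binom{3}{x} \left(\tfrac{1}{4}\right)^{x} \left(\tfrac{3}{4}\right)^{3 - x} = 10/4^{3}$,

$\beta = P\{\text{accept } H_0 \mid H_1 \text{ is true}\} = P\{x \le 1 \mid p = \tfrac{1}{2}\} = \sum_{x = 0}^{1} \binom{3}{x} \left(\tfrac{1}{2}\right)^{x} \left(\tfrac{1}{2}\right)^{3 - x} = 0.5$.]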
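The two error sizes in Exercise 6 are short binomial sums, so they can be checked in exact arithmetic. The sketch below is illustrative only: the helper names `binom_pmf` and `error_sizes` are assumptions, and the acceptance rule "accept $H_0$ when $x \le 1$" is taken from the exercise.

```python
from fractions import Fraction
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for a binomial variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def error_sizes(n, p0, p1, accept_max):
    """Return (alpha, beta) when H0 is accepted for x <= accept_max."""
    alpha = sum(binom_pmf(x, n, p0) for x in range(accept_max + 1, n + 1))
    beta = sum(binom_pmf(x, n, p1) for x in range(0, accept_max + 1))
    return alpha, beta


alpha, beta = error_sizes(3, Fraction(1, 4), Fraction(1, 2), accept_max=1)
print(alpha, beta)  # 5/32 1/2, and 5/32 equals 10/64
```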

7. Find the most powerful critical region for testing the simple hypothesis $H_0 : \theta = \theta_0$ against the simple alternative hypothesis $H_1 : \theta = \theta_1$, where the random variable $X$ has the p.d.f.

$$f(x, \theta) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\tfrac{1}{2}(x - \theta)^2\right\}, \qquad -\infty < x < \infty.$$

8. The simple hypothesis $H_0 : \theta = 2$ is tested against the alternative $H_1 : \theta = 1$ by means of a single observation from the distribution $f(x, \theta) = \theta \exp(-\theta x)$, $x \ge 0$, $\theta > 0$. Compute the best critical region, using the Neyman-Pearson lemma for $\alpha = 0.135$, and also find the power of the test.

[Hint. The power is $\int_{k}^{\infty} e^{-x}\,dx = 0.368$, where $k = 1$ is given by $\int_{k}^{\infty} 2e^{-2x}\,dx = 0.135$.]



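Define a frequency function as follows:

$$f(x, \theta) = \begin{cases} 1/\theta, & 0 \le x \le \theta \\ 0, & \text{elsewhere.} \end{cases}$$

For testing $H_0 : \theta = 1$ against $H_1 : \theta = 2$ by means of a single observed value of $x$, what would be the sizes of the type I and type II errors if the intervals (a) $0.5 \le x$ and (b) $1 \le x \le 1.5$ are taken as the critical regions? (Agra B. Sc. 72, 70)

By hypothesis, $H_0 : \theta = 1$, $H_1 : \theta = 2$ and $f(x, \theta) = 1/\theta$ for $0 \le x \le \theta$, and 0 elsewhere.

(a) Now $\omega$ is $x \ge 0.5$ and $\Omega - \omega$ is $x < 0.5$.

$$\alpha = P\{x \in \omega \mid H_0\} = P\{x \ge 0.5 \mid \theta = 1\} = \left[\int_{0.5}^{\theta} \frac{1}{\theta}\,dx\right]_{\theta = 1} = 0.5,$$

$$\beta = P\{x < 0.5 \mid \theta = 2\} = \left[\int_{0}^{0.5} \frac{1}{\theta}\,dx\right]_{\theta = 2} = \tfrac{1}{2}\big[x\big]_{0}^{0.5} = 0.25.$$

(b) Here $\omega$ is $[1, 1.5]$.

$$\alpha = P\{x \in [1, 1.5] \mid \theta = 1\} = \left[\int_{1}^{1.5} f(x, \theta)\,dx\right]_{\theta = 1} = 0,$$

since by definition $f(x, \theta) = 0$ for $x > \theta$.

$$\beta = P\{x \in \Omega - \omega \mid H_1\} = P\{x \notin [1, 1.5] \mid \theta = 2\} = 1 - \left[\int_{1}^{1.5} \frac{1}{\theta}\,dx\right]_{\theta = 2} = 1 - 0.25 = 0.75.$$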




18 

THE CALCULUS OF FINITE DIFFERENCES 

18.1. Introduction. The calculus of finite differences plays an important role in our daily life. It has been of great use for mathematicians and was originated by Sir Isaac Newton.

Assume that $y = f(x)$ is a function defined for any value of the independent variable $x$, and further assume that $f(x_j)$, where $x_j = a + jh$ $(j = 0, 1, 2, 3, \ldots)$, is the corresponding value of the dependent variable $y = f(x)$. Then $x_j = a + jh$ are called the arguments and $f(x_j) = f(a + jh)$ is called the entry corresponding to the argument $x_j = a + jh$, where $h$ denotes the interval of differencing.

The basic assumption for the application of the calculus of finite differences is that the observations are expressible as a polynomial of a certain degree.

18.2. Operator $\Delta$

Definition. The operator $\Delta$ is defined by

$$\Delta f(x) = f(x + h) - f(x) \tag{1}$$

where $f(x + h)$ denotes the entry at the argument $x + h$.

In fact, $\Delta f(x)$ is called the first difference. The second difference of $f(x)$ is denoted by $\Delta^2 f(x)$. Then

$$\begin{aligned}
\Delta^2 f(x) &= \Delta(\Delta f(x)) = \Delta(f(x + h) - f(x)) \\
&= \Delta f(x + h) - \Delta f(x) \\
&= f(x + 2h) - f(x + h) - [f(x + h) - f(x)] \\
&= f(x + 2h) - 2f(x + h) + f(x).
\end{aligned} \tag{2}$$

The third difference of $f(x)$ is denoted by $\Delta^3 f(x)$. Then

$$\begin{aligned}
\Delta^3 f(x) &= \Delta(\Delta^2 f(x)) \\
&= \Delta(f(x + 2h) - 2f(x + h) + f(x)) \\
&= \Delta f(x + 2h) - 2\Delta f(x + h) + \Delta f(x) \\
&= f(x + 3h) - f(x + 2h) - 2(f(x + 2h) - f(x + h)) + f(x + h) - f(x) \\
&= f(x + 3h) - 3f(x + 2h) + 3f(x + h) - f(x)
\end{aligned} \tag{3}$$

and so on.





The $n$th difference of $f(x)$ is denoted by $\Delta^n f(x)$. Then we find

$$\Delta^n f(x) = f(x + nh) - \binom{n}{1} f(x + (n - 1)h) + \binom{n}{2} f(x + (n - 2)h) - \cdots + (-1)^n f(x). \tag{4}$$

The differences $\Delta f(x), \Delta^2 f(x), \ldots, \Delta^n f(x)$ are shown in the following table, which we call the difference table.

Finite Differences Table

Argument   Entry      First diff.   Second diff.   Third diff.   Fourth diff.
x          f(x)       Δf(x)         Δ²f(x)         Δ³f(x)        Δ⁴f(x)

a          f(a)
                      Δf(a)
a+h        f(a+h)                   Δ²f(a)
                      Δf(a+h)                      Δ³f(a)
a+2h       f(a+2h)                  Δ²f(a+h)                     Δ⁴f(a)
                      Δf(a+2h)                     Δ³f(a+h)
a+3h       f(a+3h)                  Δ²f(a+2h)
                      Δf(a+3h)
a+4h       f(a+4h)






Remarks. 1. The first entry $f(a)$ is called the leading term and the differences $\Delta f(a), \Delta^2 f(a), \Delta^3 f(a), \ldots$ are termed the leading differences.

2. The reader should keep in mind that $\Delta$ is not a quantity but a symbol standing for an 'operation'. Hence $\Delta^2$ does not mean the square of a quantity $\Delta$ but means that the operation $\Delta$ is to be performed twice.

3. The differences of any order can be expressed in terms of the entries. This is evident from equations (2), (3) and (4).

Illustrative Examples 

Ex. 1. Let $u_x = 2x^3 - x^2 + 3x + 1$. Compute the values of $u_x$ corresponding to $x = 0, 1, 2, 3, 4, 5$ and form the table of differences. Establish that $\Delta^2 u_x = 12x + 10$. Verify this numerically.

[Madras University B. Sc. 60]

By hypothesis,

$$u_x = 2x^3 - x^2 + 3x + 1.$$

Then $u_0 = 1$, $u_1 = 5$, $u_2 = 19$, $u_3 = 55$, $u_4 = 125$ and $u_5 = 241$.






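The difference table is as follows:

x     u_x     Δu_x     Δ²u_x     Δ³u_x

0     1
              4
1     5                10
              14                 12
2     19               22
              36                 12
3     55               34
              70                 12
4     125              46
              116
5     241

$$\begin{aligned}
\Delta u_x &= u_{x+1} - u_x \\
&= 2(x + 1)^3 - (x + 1)^2 + 3(x + 1) + 1 - (2x^3 - x^2 + 3x + 1) \\
&= 6x^2 + 4x + 4
\end{aligned}$$

and

$$\Delta^2 u_x = 6(x + 1)^2 + 4(x + 1) + 4 - (6x^2 + 4x + 4) = 12x + 10.$$

Putting $x = 0, 1, 2, 3$ gives

$$\Delta^2 u_0 = 10, \quad \Delta^2 u_1 = 12 + 10 = 22, \quad \Delta^2 u_2 = 12 \cdot 2 + 10 = 34, \quad \Delta^2 u_3 = 12 \cdot 3 + 10 = 46.$$

These values agree with those computed in the difference table.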
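The difference table of Ex. 1 can be generated mechanically by repeatedly subtracting neighbouring entries. The following is a small sketch of that procedure; the function name `forward_difference_table` is illustrative and the interval of differencing is taken as unity, as in the example.

```python
def forward_difference_table(values):
    """Return [entries, first differences, second differences, ...]."""
    table = [list(values)]
    while len(table[-1]) > 1:
        prev = table[-1]
        # Each new column holds the differences of neighbouring entries.
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table


def u(x):
    return 2 * x**3 - x**2 + 3 * x + 1


table = forward_difference_table([u(x) for x in range(6)])
for order, column in enumerate(table):
    print(order, column)
```

The column of order 2 can be compared directly with $12x + 10$ for $x = 0, 1, 2, 3$, and the column of order 3 is constant, as expected for a cubic.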

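Ex. 2. Compute $\Delta \tan^{-1} x$, the interval of differencing being $h$.

[Meerut B. Sc. 1971; D. U. B. Sc. (Hons); P. U. (Hons) 57]

The definition of $\Delta$, namely $\Delta u_x = u_{x+h} - u_x$, gives:

$$\begin{aligned}
\Delta \tan^{-1} x &= \tan^{-1}(x + h) - \tan^{-1} x \\
&= \tan^{-1} \frac{x + h - x}{1 + x(x + h)} \qquad \left(\text{since } \tan^{-1} A - \tan^{-1} B = \tan^{-1} \frac{A - B}{1 + AB}\right) \\
&= \tan^{-1} \frac{h}{1 + hx + x^2}.
\end{aligned}$$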

Ex. 3. Find:

(1) $\Delta e^x$, $\Delta^2 e^x$, $\Delta^n e^x$ when the interval of differencing is $h$.

(2) $\Delta e^{ax+b}$, $\Delta^2 e^{ax+b}$ and $\Delta^n e^{ax+b}$ when the interval of differencing is unity.

(3) $\Delta^3 (1 - x)(1 - 2x)(1 - 3x)$ when the interval of differencing is unity.


(4) A (u x v x ) when the interval of differencing is 
unity. 

(i) The definition of $\Delta$, namely $\Delta u_x = u_{x+h} - u_x$, gives

$$\Delta e^x = e^{x+h} - e^x = e^x (e^h - 1).$$

Again

$$\Delta^2 e^x = \Delta(\Delta e^x) = \Delta\{e^x (e^h - 1)\} = (e^h - 1)\,\Delta e^x = (e^h - 1)\,e^x (e^h - 1) = (e^h - 1)^2 e^x.$$

Similarly, $\Delta^3 e^x = (e^h - 1)^3 e^x$. Proceeding similarly leads to

$$\Delta^n e^x = (e^h - 1)^n e^x.$$

(ii) In view of the definition of $\Delta$, we have

$$\Delta e^{ax+b} = e^{a(x+1)+b} - e^{ax+b} = e^{ax+b} (e^a - 1);$$

$$\Delta^2 e^{ax+b} = e^{ax+b} (e^a - 1)^2;$$

$$\cdots$$

$$\Delta^n e^{ax+b} = e^{ax+b} (e^a - 1)^n.$$

(iii) Now $(1 - x)(1 - 2x)(1 - 3x) = 1 - 6x + 11x^2 - 6x^3$. Then

$$\begin{aligned}
\Delta (1 - x)(1 - 2x)(1 - 3x) &= 1 - 6(x + 1) + 11(x + 1)^2 - 6(x + 1)^3 - (1 - 6x + 11x^2 - 6x^3) \\
&= -6 + 11(2x + 1) - 6(1 + 3x + 3x^2) = -1 + 4x - 18x^2,
\end{aligned}$$

$$\begin{aligned}
\Delta^2 (1 - x)(1 - 2x)(1 - 3x) &= -1 + 4(x + 1) - 18(x + 1)^2 - (-1 + 4x - 18x^2) \\
&= 4 - 18(2x + 1) = -36x - 14,
\end{aligned}$$

and hence

$$\Delta^3 (1 - x)(1 - 2x)(1 - 3x) = -36(x + 1) - 14 - (-36x - 14) = -36.$$

(iv)

$$\begin{aligned}
\Delta (u_x v_x) &= u_{x+1} v_{x+1} - u_x v_x \\
&= u_{x+1} v_{x+1} - u_{x+1} v_x + u_{x+1} v_x - u_x v_x \\
&= u_{x+1} (v_{x+1} - v_x) + v_x (u_{x+1} - u_x) \\
&= u_{x+1}\,\Delta v_x + v_x\,\Delta u_x.
\end{aligned}$$

Similarly, $\Delta (u_x v_x) = u_x\,\Delta v_x + v_{x+1}\,\Delta u_x$.

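Ex. 4. Prove that

$$\Delta \log f(x) = \log\left[1 + \frac{\Delta f(x)}{f(x)}\right].$$

[D. U. B. Sc. (Hons), 69]

The definition of $\Delta$ leads to

$$\Delta \log f(x) = \log f(x + h) - \log f(x) = \log \frac{f(x + h)}{f(x)}.$$

But $\Delta f(x) = f(x + h) - f(x)$, from which we find

$$f(x + h) = f(x) + \Delta f(x).$$

Hence

$$\Delta \log f(x) = \log \frac{f(x) + \Delta f(x)}{f(x)} = \log\left[1 + \frac{\Delta f(x)}{f(x)}\right],$$

completing the proof.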

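Ex. 5. Find $\Delta f(x)$ where

$$f(x) = \frac{1}{x^2 + 3x + 2},$$

the interval of differencing being unity.

Obviously,

$$\begin{aligned}
\Delta f(x) &= \frac{1}{(x + 1)^2 + 3(x + 1) + 2} - \frac{1}{x^2 + 3x + 2} \\
&= \frac{1}{x^2 + 5x + 6} - \frac{1}{x^2 + 3x + 2} \\
&= \frac{1}{(x + 2)(x + 3)} - \frac{1}{(x + 1)(x + 2)} = \frac{-2}{(x + 1)(x + 2)(x + 3)}.
\end{aligned}$$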

18.3. Fundamental Theorem of Finite Differences.

Theorem 1. Assume that $f(x)$ is a polynomial of $n$th degree. Then

$$\Delta^j f(x) = \text{constant for } j = n, \qquad \Delta^j f(x) = 0 \text{ for } j > n.$$

Alternatively, the $n$th order difference of a polynomial of $n$th degree is constant and the higher differences are zero.

Proof. To prove this fact, assume that $f(x)$ is a polynomial of $n$th degree such that

$$f(x) = a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n, \qquad (a_0 \ne 0)$$

where $a_j$ $(j = 0, 1, 2, \ldots, n)$ are constants.

The definition of $\Delta$ gives:

$$\begin{aligned}
\Delta f(x) &= f(x + h) - f(x) \\
&= [a_0 (x + h)^n + a_1 (x + h)^{n-1} + a_2 (x + h)^{n-2} + \cdots + a_{n-1}(x + h) + a_n] \\
&\quad - [a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n] \\
&= \Big[a_0 \Big\{x^n + \tbinom{n}{1} x^{n-1} h + \tbinom{n}{2} x^{n-2} h^2 + \cdots + h^n\Big\} \\
&\quad + a_1 \Big\{x^{n-1} + \tbinom{n-1}{1} x^{n-2} h + \tbinom{n-1}{2} x^{n-3} h^2 + \cdots + h^{n-1}\Big\} \\
&\quad + \cdots + a_{n-2}(x^2 + 2xh + h^2) + a_{n-1}(x + h) + a_n\Big] \\
&\quad - [a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n] \\
&= a_0 n h\, x^{n-1} + b_2 x^{n-2} + b_3 x^{n-3} + \cdots + b_{n-1} x + b_n,
\end{aligned}$$

where $b_j$ $(j = 2, 3, \ldots, n)$ are constants independent of $x$.

This shows that $\Delta f(x)$ is a polynomial of $(n - 1)$th degree. Hence the first order difference of a polynomial of $n$th degree is a polynomial of degree $(n - 1)$.

Again the definition of $\Delta$ provides:

$$\begin{aligned}
\Delta^2 f(x) &= \Delta(\Delta f(x)) \\
&= [a_0 n h (x + h)^{n-1} + b_2 (x + h)^{n-2} + \cdots + b_{n-1}(x + h) + b_n] \\
&\quad - [a_0 n h\, x^{n-1} + b_2 x^{n-2} + \cdots + b_{n-1} x + b_n] \\
&= a_0 n (n - 1) h^2 x^{n-2} + c_3 x^{n-3} + c_4 x^{n-4} + \cdots + c_{n-1} x + c_n,
\end{aligned}$$

where $c_j$ $(j = 3, 4, \ldots, n)$ are constants independent of $x$.

That is, $\Delta^2 f(x)$ is a polynomial of $(n - 2)$th degree. The continuation of this procedure leads to

$$\Delta^n f(x) = a_0\, n (n - 1)(n - 2) \cdots 3 \cdot 2 \cdot 1\; h^n x^{n-n} = a_0\, n!\, h^n,$$

which is a constant, say $\lambda$.

This proves that the $n$th order difference of a polynomial of the $n$th degree is constant.

Now $\Delta^{n+1} f(x) = \Delta(\Delta^n f(x)) = \Delta \lambda = \lambda - \lambda = 0$ and $\Delta^{n+2} f(x) = \Delta^{n+3} f(x) = \cdots = 0$.

This finishes the proof.

Remarks.

1. $\Delta^n f(x) = \Delta^n [a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n] = \Delta^n (a_0 x^n) + a_1 \Delta^n x^{n-1} + \cdots + a_{n-1} \Delta^n x + \Delta^n a_n = \Delta^n (a_0 x^n) = a_0\, n!\, h^n.$

2. Letting $h = 1$ gives: $\Delta^n f(x) = \Delta^n (a_0 x^n) = a_0\, n!$. Evidently $\Delta^n x^n = n!$.

Example

Evaluate: (a) $\Delta^n [a x^n + b x^{n-1} + c x^{n-2}]$

(b) $\Delta^3 [(1 - x)(1 - 2x)(1 - 3x)]$

(c) $\Delta^{10} [(1 - ax)(1 - bx^2)(1 - cx^3)(1 - dx^4)]$

(B. A. Punjab 1950)

where the interval of differencing is unity.

(a) $\Delta^n [a x^n + b x^{n-1} + c x^{n-2}] = \Delta^n (a x^n) + \Delta^n (b x^{n-1}) + \Delta^n (c x^{n-2}) = a \Delta^n x^n + b \Delta^n x^{n-1} + c \Delta^n x^{n-2} = a\, n! + b \cdot 0 + c \cdot 0$, by theorem (1), $= a\, n!$.

(b) Only the term of highest degree contributes to the third difference of a cubic, so that

$\Delta^3 [(1 - x)(1 - 2x)(1 - 3x)] = \Delta^3 [(-x)(-2x)(-3x)] = \Delta^3 (-6x^3) = -6 \Delta^3 x^3 = -6 \cdot 3! = -36.$

(c) Similarly,

$\Delta^{10} [(1 - ax)(1 - bx^2)(1 - cx^3)(1 - dx^4)] = \Delta^{10} [(-ax)(-bx^2)(-cx^3)(-dx^4)] = \Delta^{10} (abcd\, x^{10}) = abcd\, \Delta^{10} x^{10} = abcd \cdot 10!.$
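Theorem 1 and part (b) of the example can be checked numerically. The sketch below is illustrative: it represents $\Delta$ as a function acting on functions (the names `delta` and `delta_n` are assumptions), and uses exact rational arithmetic so that the constant $a_0\, n!\, h^n$ is reproduced without rounding.

```python
from fractions import Fraction
from math import factorial


def delta(f, h=1):
    """Forward difference operator: maps f to the function x -> f(x + h) - f(x)."""
    return lambda x: f(x + h) - f(x)


def delta_n(f, n, h=1):
    """Apply the forward difference operator n times."""
    for _ in range(n):
        f = delta(f, h)
    return f


n, h = 4, Fraction(1, 2)
poly = lambda x: 3 * x**4 - 7 * x**3 + x - 5   # a0 = 3, degree n = 4
nth = delta_n(poly, n, h)
print([nth(x) for x in range(3)])               # the same constant at every x
print(3 * factorial(n) * h**n)                  # a0 * n! * h^n

cubic = lambda x: (1 - x) * (1 - 2 * x) * (1 - 3 * x)
print(delta_n(cubic, 3)(0))                     # part (b), interval unity
```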

18.4. Operator $\nabla$

Definition. The backward differences, usually denoted by $\nabla$ (read as nabla), are defined as follows:

$$\nabla f(x) = f(x) - f(x - h).$$

Then

$$\nabla f(x + h) = f(x + h) - f(x) = \Delta f(x).$$

This implies that the backward difference of $f(x + h)$ is the same as the forward difference of $f(x)$. The backward difference table is as follows:


Argument   Entry
x          f(x)       ∇f(x)        ∇²f(x)        ∇³f(x)        ∇⁴f(x)

a          f(a)
                      ∇f(a+h)
a+h        f(a+h)                  ∇²f(a+2h)
                      ∇f(a+2h)                   ∇³f(a+3h)
a+2h       f(a+2h)                 ∇²f(a+3h)                   ∇⁴f(a+4h)
                      ∇f(a+3h)                   ∇³f(a+4h)
a+3h       f(a+3h)                 ∇²f(a+4h)
                      ∇f(a+4h)
a+4h       f(a+4h)






18.5. Operator $E$

Definition. The operator $E$ is defined by

$$E f(x) = f(x + h) \quad \text{or} \quad E u_x = u_{x+h},$$

where $h$ denotes an increase in the argument $x$.

The operator $E^2$ implies that the operator $E$ is to be performed twice. Then

$$E^2 f(x) = E\{E f(x)\} = E f(x + h) = f(x + 2h).$$

In general,

$$E^j f(x) = f(x + jh).$$

This result is of great importance.





Remarks. Assume the interval of differencing is unity, i.e. $h = 1$. Then we find

$$E^j f(x) = f(x + j).$$

Particularly $E f(0) = f(1 + 0) = f(1)$, $E^2 f(1) = f(1 + 2) = f(3)$, $E^3 f(-1) = f(-1 + 3) = f(2)$, $E^5 u_{-2} = u_{-2+5} = u_3$, $u_2 = E^2 u_0 = E u_1 = E^3 u_{-1}$.

Conversely, $f(1) = E f(0) = E^2 f(-1)$, or

$$u_1 = E u_0 = E^2 u_{-1},$$

$$u_2 = E^2 u_0 = E u_1 = E^3 u_{-1},$$

$$u_3 = E^3 u_0 = E^2 u_1 = E u_2 = E^4 u_{-1},$$

and so on.

Theorem 2. $\Delta = E - 1$ or $E = 1 + \Delta$. $\quad\ldots(1)$

Proof. By the definition of $\Delta$, we find

$$\Delta f(x) = f(x + h) - f(x).$$

In view of the definition of $E$, we have

$$f(x + h) = E f(x).$$

Hence $\Delta f(x) = E f(x) - f(x) = (E - 1) f(x)$.

But $f(x)$ is arbitrary. Hence

$$\Delta = E - 1 \quad \text{or} \quad E = 1 + \Delta.$$

This proves the theorem.

(1) is a very important relation between the operators $E$ and $\Delta$.

Theorem 3. $E^{-1} = 1 - \nabla$ or $E = (1 - \nabla)^{-1}$.

Proof. The definition of $\nabla$ gives:

$$\nabla f(x + h) = f(x + h) - f(x),$$

so that $f(x) = f(x + h) - \nabla f(x + h)$.

Then

$$E^{-1} f(x + h) = f(x + h) - \nabla f(x + h) = (1 - \nabla) f(x + h).$$

Thus $E^{-1} = 1 - \nabla$, or $E = (1 - \nabla)^{-1}$, completing the proof.
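The relations between $E$, $\Delta$ and $\nabla$ can be spot-checked by treating each operator as a function that maps a function to a function. This is only a sketch under that modelling assumption; the names `E`, `E_inv`, `forward` and `backward` are illustrative, and agreement on a handful of points is a sanity check, not a proof.

```python
def E(f, h=1):
    """Shift operator: (E f)(x) = f(x + h)."""
    return lambda x: f(x + h)


def E_inv(f, h=1):
    """Inverse shift: (E^-1 f)(x) = f(x - h)."""
    return lambda x: f(x - h)


def forward(f, h=1):
    """Forward difference Delta."""
    return lambda x: f(x + h) - f(x)


def backward(f, h=1):
    """Backward difference nabla."""
    return lambda x: f(x) - f(x - h)


f = lambda x: x**3 - 2 * x + 1
xs = range(-3, 4)
h = 2

# Theorem 2: Delta = E - 1
check_delta = all(forward(f, h)(x) == E(f, h)(x) - f(x) for x in xs)
# Theorem 3: E^-1 = 1 - nabla
check_nabla = all(E_inv(f, h)(x) == f(x) - backward(f, h)(x) for x in xs)
# nabla f(x + h) = Delta f(x)
check_shift = all(backward(E(f, h), h)(x) == forward(f, h)(x) for x in xs)
print(check_delta, check_nabla, check_shift)
```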

Properties of $\Delta$ and $E$

Theorem 4. The operators $E$ and $\Delta$ are commutative in operation with regard to constants.

Proof. Assume $\lambda$ is a constant. Then

$$\Delta(\lambda f(x)) = \lambda f(x + h) - \lambda f(x) = \lambda\{f(x + h) - f(x)\} = \lambda\,\Delta f(x)$$

and

$$E\{\lambda f(x)\} = \lambda f(x + h) = \lambda\,E f(x).$$

This completes the proof.

Theorem 5. The operators $E$ and $\Delta$ are distributive.

Proof. Now

$$\begin{aligned}
\Delta\{f(x) + g(x)\} &= f(x + h) + g(x + h) - \{f(x) + g(x)\} \\
&= [f(x + h) - f(x)] + [g(x + h) - g(x)] \\
&= \Delta f(x) + \Delta g(x).
\end{aligned}$$

Also

$$E[f(x) + g(x)] = f(x + h) + g(x + h) = E f(x) + E g(x).$$

The proof is terminated.

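Theorem 6. The operators $\Delta$ and $E$ are linear, i.e. if $\lambda$ and $\mu$ are constants, then

$$\Delta(\lambda f(x) + \mu g(x)) = \lambda\,\Delta f(x) + \mu\,\Delta g(x),$$

$$E(\lambda f(x) + \mu g(x)) = \lambda\,E f(x) + \mu\,E g(x).$$

Proof. We have

$$\begin{aligned}
\Delta(\lambda f(x) + \mu g(x)) &= \lambda f(x + h) + \mu g(x + h) - (\lambda f(x) + \mu g(x)) \\
&= \lambda (f(x + h) - f(x)) + \mu (g(x + h) - g(x)) \\
&= \lambda\,\Delta f(x) + \mu\,\Delta g(x)
\end{aligned}$$

and

$$E(\lambda f(x) + \mu g(x)) = \lambda f(x + h) + \mu g(x + h) = \lambda\,E f(x) + \mu\,E g(x).$$

This terminates the proof.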

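Theorem 7. $E$ and $\Delta$ obey the laws of indices, i.e.

$$E^i E^j f(x) = E^{i+j} f(x) \quad \text{and} \quad \Delta^i \Delta^j f(x) = \Delta^{i+j} f(x).$$

Proof. Now

$$E^i E^j f(x) = E^i (E^j f(x)) = E^i f(x + jh) = f(x + jh + ih) = f(x + h(i + j)) = E^{i+j} f(x)$$

and

$$\begin{aligned}
\Delta^i \Delta^j f(x) &= (E - 1)^i (E - 1)^j f(x) \\
&= \Big[\Big\{E^i - \tbinom{i}{1} E^{i-1} + \tbinom{i}{2} E^{i-2} - \cdots + (-1)^i\Big\}\Big\{E^j - \tbinom{j}{1} E^{j-1} + \cdots + (-1)^j\Big\}\Big] f(x) \\
&= \Big[E^{i+j} - \Big\{\tbinom{i}{1} + \tbinom{j}{1}\Big\} E^{i+j-1} + \Big\{\tbinom{i}{2} + \tbinom{j}{2} + \tbinom{i}{1}\tbinom{j}{1}\Big\} E^{i+j-2} \\
&\quad - \Big\{\tbinom{i}{3} + \tbinom{j}{3} + \tbinom{i}{1}\tbinom{j}{2} + \tbinom{i}{2}\tbinom{j}{1}\Big\} E^{i+j-3} + \cdots + (-1)^{i+j}\Big] f(x).
\end{aligned}$$

Note that

$$\tbinom{i}{1} + \tbinom{j}{1} = i + j = \tbinom{i+j}{1},$$

$$\tbinom{i}{2} + \tbinom{j}{2} + \tbinom{i}{1}\tbinom{j}{1} = \frac{i(i - 1)}{2} + \frac{j(j - 1)}{2} + ij = \frac{i^2 - i + j^2 - j + 2ij}{2} = \frac{(i + j)^2 - (i + j)}{2} = \tbinom{i+j}{2},$$

$$\tbinom{i}{3} + \tbinom{j}{3} + \tbinom{i}{1}\tbinom{j}{2} + \tbinom{i}{2}\tbinom{j}{1} = \tbinom{i+j}{3},$$

on simplification, and so on.

Hence

$$\begin{aligned}
\Delta^i \Delta^j f(x) &= \Big[E^{i+j} - \tbinom{i+j}{1} E^{i+j-1} + \tbinom{i+j}{2} E^{i+j-2} - \tbinom{i+j}{3} E^{i+j-3} + \cdots + (-1)^{i+j}\Big] f(x) \\
&= (E - 1)^{i+j} f(x) = \Delta^{i+j} f(x).
\end{aligned}$$

This ends the proof.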

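Remarks. 1. The operators $E$ and $\Delta$ are not commutative with regard to a variable, i.e.

$$E(f(x)\,g(x)) \ne f(x)\,E g(x) \quad \text{and} \quad \Delta(f(x)\,g(x)) \ne f(x)\,\Delta g(x).$$

Proof. Now $E(f(x)\,g(x)) = f(x + h)\,g(x + h) = E f(x) \cdot E g(x)$ and

$$\begin{aligned}
\Delta(f(x)\,g(x)) &= f(x + h)\,g(x + h) - f(x)\,g(x) \\
&= (f(x + h) - f(x))\,g(x + h) + f(x)\,(g(x + h) - g(x)) \\
&= g(x + h)\,\Delta f(x) + f(x)\,\Delta g(x).
\end{aligned}$$

2. The operator $\Delta$, operated on the quotient, provides:

$$\begin{aligned}
\Delta\left(\frac{f(x)}{g(x)}\right) &= \frac{f(x + h)}{g(x + h)} - \frac{f(x)}{g(x)} = \frac{f(x + h)\,g(x) - g(x + h)\,f(x)}{g(x + h)\,g(x)} \\
&= \frac{(f(x + h) - f(x))\,g(x) - f(x)\,(g(x + h) - g(x))}{g(x + h)\,g(x)} \\
&= \frac{g(x)\,\Delta f(x) - f(x)\,\Delta g(x)}{g(x)\,g(x + h)}.
\end{aligned}$$

3. $\Delta(\text{constant}) = 0$.

For, let $f(x) = \lambda$. Then $\Delta f(x) = f(x + h) - f(x) = \lambda - \lambda = 0$, implying that $\Delta(\text{constant}) = 0$.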

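4. $$\Delta^n f(x) = \sum_{j=0}^{n} (-1)^{n-j} \binom{n}{j} f(x + jh).$$

This result is termed the 'Lagrange form of the difference', by means of which we may compute the $n$th difference without completing the whole table.

Proof. Now

$$\begin{aligned}
\Delta^n f(x) &= (E - 1)^n f(x) \\
&= \Big[\sum_{j=0}^{n} \binom{n}{j} E^j (-1)^{n-j}\Big] f(x), \text{ by Binomial expansion} \\
&= \sum_{j=0}^{n} (-1)^{n-j} \binom{n}{j} E^j f(x) = \sum_{j=0}^{n} (-1)^{n-j} \binom{n}{j} f(x + jh).
\end{aligned}$$

This finishes the proof.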
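The claim that the Lagrange form avoids building the whole table can be illustrated by computing the same $n$th difference in both ways. The sketch below is illustrative; the function names are assumptions, and the test function is an arbitrary polynomial chosen for the example.

```python
from math import comb


def nth_difference_direct(f, x, n, h=1):
    """n-th forward difference from the binomial (Lagrange) form."""
    return sum((-1) ** (n - j) * comb(n, j) * f(x + j * h) for j in range(n + 1))


def nth_difference_table(f, x, n, h=1):
    """n-th forward difference obtained by building the table column by column."""
    column = [f(x + j * h) for j in range(n + 1)]
    for _ in range(n):
        column = [b - a for a, b in zip(column, column[1:])]
    return column[0]


g = lambda x: x**5 - 4 * x**2 + 7
for n in range(6):
    print(n, nth_difference_direct(g, 2, n, h=3), nth_difference_table(g, 2, n, h=3))
```

Both columns of the printed output should agree for every order $n$.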

5. $\Delta f(x) = 0$ does not imply that either $\Delta = 0$ or $f(x) = 0$.

6. The operators $E$ and $\Delta$ cannot stand without an operand.

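Relation between $E$ and $D$, where $D = \dfrac{d}{dx}$.

The definition of $E$ leads to

$$\begin{aligned}
E f(x) &= f(x + h) \\
&= f(x) + h f'(x) + \frac{h^2}{2!} f''(x) + \frac{h^3}{3!} f'''(x) + \cdots, \text{ by Taylor's Theorem} \\
&= f(x) + h D f(x) + \frac{h^2}{2!} D^2 f(x) + \frac{h^3}{3!} D^3 f(x) + \cdots, \text{ where } D = \frac{d}{dx} \\
&= \Big[1 + hD + \frac{(hD)^2}{2!} + \frac{(hD)^3}{3!} + \cdots\Big] f(x) \\
&= e^{hD} f(x).
\end{aligned}$$

This implies that $E = 1 + \Delta = e^{hD}$.

Assuming the interval of differencing as unity, we find

$$E = 1 + \Delta = e^{D},$$

from which we obtain

$$D = \log(1 + \Delta) = \Delta - \tfrac{1}{2}\Delta^2 + \tfrac{1}{3}\Delta^3 - \cdots$$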
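For a polynomial the series $D = \Delta - \tfrac{1}{2}\Delta^2 + \tfrac{1}{3}\Delta^3 - \cdots$ terminates, so the relation can be tested exactly. The following sketch assumes unit interval of differencing and uses $f(x) = x^3$, whose derivative is $3x^2$; the function name `derivative_from_differences` is illustrative.

```python
from fractions import Fraction


def delta(f):
    """Forward difference with unit interval: x -> f(x + 1) - f(x)."""
    return lambda x: f(x + 1) - f(x)


def derivative_from_differences(f, x, degree):
    """Sum of (-1)^(k+1) * Delta^k f(x) / k for k = 1, ..., degree."""
    total = Fraction(0)
    g = f
    for k in range(1, degree + 1):
        g = delta(g)                      # g is now Delta^k f
        total += Fraction((-1) ** (k + 1), k) * g(x)
    return total


cube = lambda x: x**3
for x in range(4):
    print(x, derivative_from_differences(cube, x, 3), 3 * x**2)
```

Terms beyond the third vanish here because the fourth and higher differences of a cubic are zero.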

18.6. Method of Separation of Variables

There exists a relation $E = 1 + \Delta$ between the operators $E$ and $\Delta$. This relation allows us to prove a number of useful identities. The method based on the relation $E = 1 + \Delta$ is known as the method of separation of variables. The method is clearly explained in the following examples.

Illustrative Examples 


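Ex. 1. Prove that

$$u_1 x + u_2 x^2 + u_3 x^3 + \cdots = \frac{x}{1 - x}\,u_1 + \frac{x^2}{(1 - x)^2}\,\Delta u_1 + \frac{x^3}{(1 - x)^3}\,\Delta^2 u_1 + \cdots$$

(Agra B.Sc. 66, M.Sc. 59; Bangalore B.Sc. 66; Delhi B.Sc. 67; Bangalore B.Sc. 68)

Proof. Now

$$\begin{aligned}
\text{L.H.S.} &= u_1 x + u_2 x^2 + u_3 x^3 + \cdots \\
&= x u_1 + x^2 E u_1 + x^3 E^2 u_1 + \cdots, \text{ since } u_{x+h} = E^h u_x \\
&= x\,[1 + xE + x^2 E^2 + \cdots]\,u_1 \\
&= x \left(\frac{1}{1 - r}\right) u_1, \text{ where } r = xE, \text{ since } 1 + r + r^2 + \cdots = \frac{1}{1 - r} \\
&= x \left[\frac{1}{1 - x - x\Delta}\right] u_1, \text{ replacing } E \text{ by } 1 + \Delta \\
&= \frac{x}{1 - x}\left[1 - \frac{x\Delta}{1 - x}\right]^{-1} u_1 \\
&= \frac{x}{1 - x}\left[1 + \frac{x\Delta}{1 - x} + \frac{x^2 \Delta^2}{(1 - x)^2} + \cdots\right] u_1 \\
&= \frac{x}{1 - x}\,u_1 + \frac{x^2}{(1 - x)^2}\,\Delta u_1 + \frac{x^3}{(1 - x)^3}\,\Delta^2 u_1 + \cdots = \text{R.H.S.}
\end{aligned}$$

This proves the identity.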

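Ex. 2. $$\Delta^n y_x = y_{x+n} - \binom{n}{1} y_{x+n-1} + \binom{n}{2} y_{x+n-2} - \cdots + (-1)^n y_x.$$

(M. A. Delhi 60)

Proof. Now

$$\begin{aligned}
\Delta^n y_x &= (E - 1)^n y_x, \text{ since } E = 1 + \Delta \\
&= \Big[E^n - \tbinom{n}{1} E^{n-1} + \tbinom{n}{2} E^{n-2} - \cdots + (-1)^n\Big] y_x, \text{ by means of Binomial expansion} \\
&= E^n y_x - \tbinom{n}{1} E^{n-1} y_x + \tbinom{n}{2} E^{n-2} y_x - \cdots + (-1)^n y_x \\
&= y_{x+n} - \tbinom{n}{1} y_{x+n-1} + \tbinom{n}{2} y_{x+n-2} - \cdots + (-1)^n y_x, \text{ since } E^n y_x = y_{x+n}.
\end{aligned}$$

This finishes the proof.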

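Ex. 3. $$u_0 - u_1 + u_2 - \cdots = \tfrac{1}{2} u_0 - \tfrac{1}{4} \Delta u_0 + \tfrac{1}{8} \Delta^2 u_0 - \cdots$$

Now

$$\begin{aligned}
\text{L.H.S.} &= u_0 - E u_0 + E^2 u_0 - E^3 u_0 + \cdots, \text{ since } u_n = E^n u_0 \\
&= (1 - E + E^2 - E^3 + \cdots)\,u_0 = \frac{1}{1 + E}\,u_0 \\
&= \frac{1}{2 + \Delta}\,u_0 = \tfrac{1}{2}\Big(1 + \frac{\Delta}{2}\Big)^{-1} u_0 \\
&= \tfrac{1}{2}\Big[1 - \frac{\Delta}{2} + \frac{\Delta^2}{4} - \frac{\Delta^3}{8} + \cdots\Big] u_0 \\
&= \tfrac{1}{2} u_0 - \tfrac{1}{4} \Delta u_0 + \tfrac{1}{8} \Delta^2 u_0 - \tfrac{1}{16} \Delta^3 u_0 + \cdots
\end{aligned}$$

This is what we wished to show.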
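The identity of Ex. 3 can be tried on a concrete series. The sketch below is an illustration only: it assumes $u_n = 1/(n + 1)$, for which $u_0 - u_1 + u_2 - \cdots = \log 2$ (a standard result not taken from the text), and compares partial sums of the two sides using 20 terms each. The function names are illustrative.

```python
from fractions import Fraction
from math import log


def leading_differences(u, count):
    """Return [u_0, Delta u_0, Delta^2 u_0, ...] with `count` entries."""
    column = [u(k) for k in range(count)]
    leading = []
    while column:
        leading.append(column[0])
        column = [b - a for a, b in zip(column, column[1:])]
    return leading


def transformed_sum(u, terms):
    """Partial sum of (1/2) u_0 - (1/4) Delta u_0 + (1/8) Delta^2 u_0 - ..."""
    diffs = leading_differences(u, terms)
    return sum((-1) ** k * d / 2 ** (k + 1) for k, d in enumerate(diffs))


u = lambda n: Fraction(1, n + 1)
direct = sum((-1) ** n * u(n) for n in range(20))
transformed = transformed_sum(u, 20)
print(float(direct), float(transformed), log(2))
```

With the same number of terms, the right-hand side of the identity lies much closer to $\log 2$ than the left-hand side does, which is one practical reason such transformations are used.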
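Ex. 4. $$u_x - \frac{1}{8} \Delta^2 u_{x-1} + \frac{1 \cdot 3}{8 \cdot 16} \Delta^4 u_{x-2} - \frac{1 \cdot 3 \cdot 5}{8 \cdot 16 \cdot 24} \Delta^6 u_{x-3} + \cdots = u_{x+1/2} - \tfrac{1}{2} \Delta u_{x+1/2} + \tfrac{1}{4} \Delta^2 u_{x+1/2} - \tfrac{1}{8} \Delta^3 u_{x+1/2} + \cdots$$

Now

$$\begin{aligned}
\text{L.H.S.} &= u_x - \frac{1}{8} \Delta^2 E^{-1} u_x + \frac{1 \cdot 3}{8 \cdot 16} \Delta^4 E^{-2} u_x - \frac{1 \cdot 3 \cdot 5}{8 \cdot 16 \cdot 24} \Delta^6 E^{-3} u_x + \cdots \\
&= u_x + \left(-\tfrac{1}{2}\right)\left(\frac{\Delta^2}{4E}\right) u_x + \frac{(-\tfrac{1}{2})(-\tfrac{1}{2} - 1)}{1 \cdot 2}\left(\frac{\Delta^2}{4E}\right)^2 u_x + \frac{(-\tfrac{1}{2})(-\tfrac{1}{2} - 1)(-\tfrac{1}{2} - 2)}{1 \cdot 2 \cdot 3}\left(\frac{\Delta^2}{4E}\right)^3 u_x + \cdots \\
&= \left(1 + \frac{\Delta^2}{4E}\right)^{-1/2} u_x = \left(\frac{4E + \Delta^2}{4E}\right)^{-1/2} u_x \\
&= \left(\frac{4(1 + \Delta) + \Delta^2}{4E}\right)^{-1/2} u_x = \left(\frac{(2 + \Delta)^2}{4E}\right)^{-1/2} u_x \\
&= \frac{2 E^{1/2}}{2 + \Delta}\,u_x = E^{1/2}\left(1 + \frac{\Delta}{2}\right)^{-1} u_x = E^{1/2}\left(1 - \frac{\Delta}{2} + \frac{\Delta^2}{4} - \cdots\right) u_x \\
&= u_{x+1/2} - \tfrac{1}{2} \Delta u_{x+1/2} + \tfrac{1}{4} \Delta^2 u_{x+1/2} - \cdots
\end{aligned}$$

This terminates the proof.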

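Ex. 5. Sum to $n$ terms the series:

$$1 \cdot 2\,\Delta x^n - 2 \cdot 3\,\Delta^2 x^n + 3 \cdot 4\,\Delta^3 x^n - 4 \cdot 5\,\Delta^4 x^n + \cdots$$

Now

$$\begin{aligned}
1 \cdot 2\,\Delta x^n - 2 \cdot 3\,\Delta^2 x^n + 3 \cdot 4\,\Delta^3 x^n - 4 \cdot 5\,\Delta^4 x^n + \cdots &= 2\Delta\,[1 - 3\Delta + 3 \cdot 2\,\Delta^2 - 2 \cdot 5\,\Delta^3 + \cdots]\,x^n \\
&= 2\Delta\,[1 + \Delta]^{-3} x^n = 2(E - 1)\,E^{-3} x^n = 2(E^{-2} - E^{-3})\,x^n \\
&= 2\,[(x - 2)^n - (x - 3)^n].
\end{aligned}$$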

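Ex. 6. Evaluate: $\left(\dfrac{\Delta^2}{E}\right) x^3$.

(B.A. Lucknow 62, B.Sc. Agra 59)

Now

$$\begin{aligned}
\left(\frac{\Delta^2}{E}\right) x^3 &= \left(\frac{(E - 1)^2}{E}\right) x^3 = \left(\frac{E^2 - 2E + 1}{E}\right) x^3 \\
&= (E - 2 + E^{-1})\,x^3 = (x + 1)^3 - 2x^3 + (x - 1)^3 \\
&= x^3 + 3x^2 + 3x + 1 - 2x^3 + x^3 - 3x^2 + 3x - 1 = 6x.
\end{aligned}$$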

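Ex. 7. Prove that

$$e^x = \left(\frac{\Delta^2}{E}\right) e^x \cdot \frac{E e^x}{\Delta^2 e^x},$$

where the interval of differencing is $h$.

Now

$$\begin{aligned}
\text{R.H.S.} &= \Delta^2 E^{-1} e^x \cdot \frac{e^{x+h}}{e^x (e^h - 1)^2}, \text{ since } \Delta^2 e^x = (e^h - 1)^2 e^x \\
&= \Delta^2 e^{x-h} \cdot \frac{e^h}{(e^h - 1)^2} = e^{-h}\,\Delta^2 e^x \cdot \frac{e^h}{(e^h - 1)^2} \\
&= e^{-h} (e^h - 1)^2 e^x \cdot \frac{e^h}{(e^h - 1)^2} = e^x.
\end{aligned}$$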




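Ex. 8. Apply the method of separation of variables to establish:

(a) $$u_0 + u_1 + \cdots + u_n = \binom{n+1}{1} u_0 + \binom{n+1}{2} \Delta u_0 + \binom{n+1}{3} \Delta^2 u_0 + \cdots + \Delta^n u_0$$

(Vrk. U. 67; Mysore 67)

(b) $$u_0 + \frac{u_1 x}{1!} + \frac{u_2 x^2}{2!} + \frac{u_3 x^3}{3!} + \cdots = e^x \left[u_0 + x \Delta u_0 + \frac{x^2}{2!} \Delta^2 u_0 + \frac{x^3}{3!} \Delta^3 u_0 + \cdots\right]$$

(Krk. 68; Vikram 60; Bangalore 69; Mysore 67; Delhi 69)

(c) $$u_x - u_{x+1} + u_{x+2} - u_{x+3} + \cdots = \tfrac{1}{2}\left[u_{x-1/2} - \tfrac{1}{8} \Delta^2 u_{x-3/2} + \frac{1 \cdot 3}{2!}\left(\tfrac{1}{8}\right)^2 \Delta^4 u_{x-5/2} - \frac{1 \cdot 3 \cdot 5}{3!}\left(\tfrac{1}{8}\right)^3 \Delta^6 u_{x-7/2} + \cdots\right]$$

(Poona 67; Lucknow 67, 60; Agra 67)

(d) $$\Delta x^m - \tfrac{1}{2} \Delta^2 x^m + \frac{1 \cdot 3}{2 \cdot 4} \Delta^3 x^m - \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} \Delta^4 x^m + \cdots \ (m \text{ terms}) = \left(x + \tfrac{1}{2}\right)^m - \left(x - \tfrac{1}{2}\right)^m$$

(Lucknow 67; Vikram 61)

For,

(a) $$\begin{aligned}
u_0 + u_1 + \cdots + u_n &= u_0 + E u_0 + E^2 u_0 + \cdots + E^n u_0 \\
&= (1 + E + E^2 + \cdots + E^n)\,u_0 = \left(\frac{E^{n+1} - 1}{E - 1}\right) u_0 \\
&= \frac{1}{\Delta}\left[(1 + \Delta)^{n+1} - 1\right] u_0 \\
&= \frac{1}{\Delta}\left[\tbinom{n+1}{1} \Delta + \tbinom{n+1}{2} \Delta^2 + \tbinom{n+1}{3} \Delta^3 + \cdots + \Delta^{n+1}\right] u_0 \\
&= \tbinom{n+1}{1} u_0 + \tbinom{n+1}{2} \Delta u_0 + \tbinom{n+1}{3} \Delta^2 u_0 + \cdots + \Delta^n u_0.
\end{aligned}$$

This proves (a).

(b) Now

$$\begin{aligned}
u_0 + \frac{u_1 x}{1!} + \frac{u_2 x^2}{2!} + \frac{u_3 x^3}{3!} + \cdots &= u_0 + x E u_0 + \frac{x^2}{2!} E^2 u_0 + \frac{x^3}{3!} E^3 u_0 + \cdots \\
&= \left\{1 + xE + \frac{x^2 E^2}{2!} + \frac{x^3 E^3}{3!} + \cdots\right\} u_0 = e^{xE} u_0 = e^{x(1 + \Delta)} u_0 \\
&= e^x \left[1 + x\Delta + \frac{x^2 \Delta^2}{2!} + \cdots\right] u_0 \\
&= e^x \left[u_0 + x \Delta u_0 + \frac{x^2}{2!} \Delta^2 u_0 + \frac{x^3}{3!} \Delta^3 u_0 + \cdots\right].
\end{aligned}$$

This proves (b).

(c) Now

$$\begin{aligned}
\text{R.H.S.} &= \tfrac{1}{2}\left[E^{-1/2} - \tfrac{1}{8} \Delta^2 E^{-3/2} + \frac{1 \cdot 3}{2!}\left(\tfrac{1}{8}\right)^2 \Delta^4 E^{-5/2} - \cdots\right] u_x \\
&= \tfrac{1}{2} E^{-1/2}\left[1 - \tfrac{1}{2}\left(\frac{\Delta^2}{4E}\right) + \frac{1 \cdot 3}{2!}\left(\tfrac{1}{2}\right)^2 \left(\frac{\Delta^2}{4E}\right)^2 - \cdots\right] u_x \\
&= \tfrac{1}{2} E^{-1/2}\left(1 + \frac{\Delta^2}{4E}\right)^{-1/2} u_x = \tfrac{1}{2} E^{-1/2}\left(\frac{4E + \Delta^2}{4E}\right)^{-1/2} u_x \\
&= [4(1 + \Delta) + \Delta^2]^{-1/2} u_x = [(2 + \Delta)^2]^{-1/2} u_x \\
&= [1 + (1 + \Delta)]^{-1} u_x = (1 + E)^{-1} u_x \\
&= (1 - E + E^2 - E^3 + \cdots)\,u_x \\
&= u_x - u_{x+1} + u_{x+2} - \cdots
\end{aligned}$$

This proves (c).

(d) The fact that $j > m$ implies $\Delta^j x^m = 0$ asserts that the sum of the series up to $m$ terms remains the same as the sum of the series continued to infinity.

Hence

$$\begin{aligned}
\Delta x^m - \tfrac{1}{2} \Delta^2 x^m + \frac{1 \cdot 3}{2 \cdot 4} \Delta^3 x^m - \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} \Delta^4 x^m + \cdots &= \Delta\left[1 - \tfrac{1}{2} \Delta + \frac{1 \cdot 3}{2 \cdot 4} \Delta^2 - \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} \Delta^3 + \cdots\right] x^m \\
&= \Delta\left[1 + \left(-\tfrac{1}{2}\right) \Delta + \frac{(-\tfrac{1}{2})(-\tfrac{3}{2})}{2!} \Delta^2 + \frac{(-\tfrac{1}{2})(-\tfrac{3}{2})(-\tfrac{5}{2})}{3!} \Delta^3 + \cdots \text{ up to infinity}\right] x^m \\
&= \Delta (1 + \Delta)^{-1/2} x^m = \Delta E^{-1/2} x^m \\
&= (E - 1)\,E^{-1/2} x^m = (E^{1/2} - E^{-1/2})\,x^m \\
&= \left(x + \tfrac{1}{2}\right)^m - \left(x - \tfrac{1}{2}\right)^m.
\end{aligned}$$

This proves (d).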
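Ex. 9. If the fifth order differences of $u_x$ are constant, then

$$u_{2\frac{1}{2}} = \tfrac{1}{2} c + \frac{25 (c - b) + 3 (a - c)}{256},$$

where $a = u_0 + u_5$, $b = u_1 + u_4$, $c = u_2 + u_3$.

[Agra B. Sc. 68, 61; Delhi 61; Lucknow 66; Mysore 67]

In view of the definition of $E$, namely $E^j u_x = u_{x+j}$, we find

$$\begin{aligned}
u_{2\frac{1}{2}} &= E^{5/2} u_0 = (1 + \Delta)^{5/2} u_0 \\
&= \Big[1 + \tfrac{5}{2} \Delta + \tfrac{1}{2!} \cdot \tfrac{5}{2} \cdot \tfrac{3}{2}\,\Delta^2 + \tfrac{1}{3!} \cdot \tfrac{5}{2} \cdot \tfrac{3}{2} \cdot \tfrac{1}{2}\,\Delta^3 + \tfrac{1}{4!} \cdot \tfrac{5}{2} \cdot \tfrac{3}{2} \cdot \tfrac{1}{2} \left(-\tfrac{1}{2}\right) \Delta^4 \\
&\quad + \tfrac{1}{5!} \cdot \tfrac{5}{2} \cdot \tfrac{3}{2} \cdot \tfrac{1}{2} \left(-\tfrac{1}{2}\right)\left(-\tfrac{3}{2}\right) \Delta^5\Big] u_0
\end{aligned}$$

(the series stops here since the fifth order differences of $u_x$ are constant by hypothesis)

$$\begin{aligned}
&= u_0 + \tfrac{5}{2} \Delta u_0 + \tfrac{15}{8} \Delta^2 u_0 + \tfrac{5}{16} \Delta^3 u_0 - \tfrac{5}{128} \Delta^4 u_0 + \tfrac{3}{256} \Delta^5 u_0 \\
&= u_0 + \tfrac{5}{2} (E - 1) u_0 + \tfrac{15}{8} (E - 1)^2 u_0 + \tfrac{5}{16} (E - 1)^3 u_0 - \tfrac{5}{128} (E - 1)^4 u_0 + \tfrac{3}{256} (E - 1)^5 u_0 \\
&= u_0 + \tfrac{5}{2} (u_1 - u_0) + \tfrac{15}{8} (u_2 - 2u_1 + u_0) + \tfrac{5}{16} (u_3 - 3u_2 + 3u_1 - u_0) \\
&\quad - \tfrac{5}{128} (u_4 - 4u_3 + 6u_2 - 4u_1 + u_0) + \tfrac{3}{256} (u_5 - 5u_4 + 10u_3 - 10u_2 + 5u_1 - u_0) \\
&= u_0 \left(1 - \tfrac{5}{2} + \tfrac{15}{8} - \tfrac{5}{16} - \tfrac{5}{128} - \tfrac{3}{256}\right) + u_1 \left(\tfrac{5}{2} - \tfrac{15}{4} + \tfrac{15}{16} + \tfrac{5}{32} + \tfrac{15}{256}\right) \\
&\quad + u_2 \left(\tfrac{15}{8} - \tfrac{15}{16} - \tfrac{15}{64} - \tfrac{15}{128}\right) + u_3 \left(\tfrac{5}{16} + \tfrac{5}{32} + \tfrac{15}{128}\right) + u_4 \left(-\tfrac{5}{128} - \tfrac{15}{256}\right) + \tfrac{3}{256} u_5 \\
&= \tfrac{3}{256} u_0 - \tfrac{25}{256} u_1 + \tfrac{75}{128} u_2 + \tfrac{75}{128} u_3 - \tfrac{25}{256} u_4 + \tfrac{3}{256} u_5 \\
&= \tfrac{3}{256} (u_0 + u_5) - \tfrac{25}{256} (u_1 + u_4) + \tfrac{75}{128} (u_2 + u_3) \\
&= \tfrac{3a}{256} - \tfrac{25b}{256} + \tfrac{75c}{128}, \text{ by hypothesis} \\
&= \tfrac{3a}{256} - \tfrac{25b}{256} + \tfrac{1}{2} c + \tfrac{11c}{128} \\
&= \tfrac{1}{2} c + \frac{3a - 25b + 22c}{256} = \tfrac{1}{2} c + \frac{25 (c - b) + 3 (a - c)}{256}.
\end{aligned}$$

This completes the proof.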
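Since the result of Ex. 9 is exact whenever the fifth differences are constant, it can be tested on any polynomial of degree five. The sketch below is illustrative: the quintic is an arbitrary choice made for the check, and the function name `midpoint_from_pairs` is not from the text.

```python
from fractions import Fraction


def midpoint_from_pairs(u0, u1, u2, u3, u4, u5):
    """Value at x = 5/2 from the formula c/2 + [25(c - b) + 3(a - c)] / 256."""
    a, b, c = u0 + u5, u1 + u4, u2 + u3
    return Fraction(c, 2) + Fraction(25 * (c - b) + 3 * (a - c), 256)


quintic = lambda x: 2 * x**5 - x**4 + 3 * x**2 - 7   # fifth differences constant
values = [quintic(x) for x in range(6)]
exact = quintic(Fraction(5, 2))
print(midpoint_from_pairs(*values), exact)
```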

Ex. 10. Assume that p, q, r and s arc the successive entries 
corresponding to equidistant arguments in a table. L-t third differen- 
ces be considered. Then the entry corresponding to the argument 
half way between the arguments oj q and r is 



294 


Mathematical Statistics 


A+14 B ■ 

where A denotes the arithmetic mean of q and r and B , of 

3q—2p—s and 3r—2s—p . 

(Agra B. Sc. 67, 65, 62, Delhi (Statistics) 68] 
For, let the interval of differencing be h. Then the equidistant
arguments are a, a+h, a+2h and a+3h. The difference table is
as follows :

Argument   Entry   $\Delta u_x$   $\Delta^2 u_x$   $\Delta^3 u_x$

a          p
                   q-p
a+h        q                      r-2q+p
                   r-q                             s-3r+3q-p
a+2h       r                      s-2r+q
                   s-r
a+3h       s

The argument half way between the arguments of q and r is

$$\tfrac12\,(a+h+a+2h) = a + \tfrac{3h}{2}.$$

Then the required entry is

$$\begin{aligned}
u_{a+\frac32 h} &= E^{3/2}u_a = (1+\Delta)^{3/2}u_a\\
&= \Bigl[1 + \tfrac32\Delta + \tfrac32\cdot\tfrac12\cdot\tfrac{\Delta^2}{2!} + \tfrac32\cdot\tfrac12\bigl(-\tfrac12\bigr)\tfrac{\Delta^3}{3!}\Bigr]u_a
 \quad\text{(since only third differences are considered)}\\
&= u_a + \tfrac32\Delta u_a + \tfrac38\Delta^2u_a - \tfrac1{16}\Delta^3u_a\\
&= p + \tfrac32(q-p) + \tfrac38(r-2q+p) - \tfrac1{16}(s-3r+3q-p)\\
&= p\bigl(1-\tfrac32+\tfrac38+\tfrac1{16}\bigr) + q\bigl(\tfrac32-\tfrac34-\tfrac3{16}\bigr) + r\bigl(\tfrac38+\tfrac3{16}\bigr) - \tfrac1{16}s\\
&= -\tfrac1{16}p + \tfrac9{16}q + \tfrac9{16}r - \tfrac1{16}s\\
&= \tfrac12(q+r) + \tfrac1{16}(q+r-p-s). \qquad\ldots(1)
\end{aligned}$$

Our hypothesis says that

A = arithmetic mean of q and r $= \tfrac12(q+r)$
and B = arithmetic mean of $3q-2p-s$ and $3r-2s-p$
$= \tfrac12(3q-2p-s+3r-2s-p) = \tfrac32(q+r-p-s)$.

Hence $A + \tfrac1{24}B = \tfrac12(q+r) + \tfrac1{16}(q+r-p-s)$,
by means of which (1) becomes

$$u_{a+\frac32 h} = A + \tfrac1{24}B.$$

This is what we wished to find.
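The result of Ex. 10 can be checked numerically. The following is only a small sketch, not part of the original text; the function name `midpoint_entry` is an assumption made for illustration. It uses exact fractions, and the check relies on the fact that the formula is exact whenever the entries come from a cubic.

```python
from fractions import Fraction

def midpoint_entry(p, q, r, s):
    """Entry half way between q and r, computed as A + B/24 (Ex. 10).

    p, q, r, s are integer entries at equidistant arguments.
    """
    A = Fraction(q + r, 2)                                # mean of q and r
    B = Fraction((3*q - 2*p - s) + (3*r - 2*s - p), 2)    # mean of the two expressions
    return A + B / 24

# Entries of x^3 at x = 0, 1, 2, 3; the midpoint of 1 and 2 is x = 3/2.
print(midpoint_entry(0, 1, 8, 27))   # (3/2)^3 = 27/8
```

Since third differences suffice for a cubic, the printed value agrees with $(3/2)^3$.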

Newton's Gregory (Forward Interpolation) Formula for Equal
Intervals.

Theorem 8. For all positive integral values of n,

$$f(a+nh) = f(a) + \binom{n}{1}\Delta f(a) + \binom{n}{2}\Delta^2 f(a) + \dots + \Delta^n f(a).$$

Proof. The definition of E, namely $E^{j}f(x) = f(x+jh)$, gives

$$\begin{aligned}
f(a+nh) &= E^{n}f(a)\\
&= (1+\Delta)^{n}f(a), \quad\text{since } \Delta = E-1\\
&= \Bigl[1 + \binom{n}{1}\Delta + \binom{n}{2}\Delta^2 + \dots + \Delta^n\Bigr]f(a)\\
&= f(a) + \binom{n}{1}\Delta f(a) + \binom{n}{2}\Delta^2 f(a) + \dots + \Delta^n f(a).
\end{aligned}$$

This proves the theorem.

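Examples.

Ex. 1. Newton's Gregory formula can be put in the form

$$u_x = u_0 + x\,\Delta u_0 - xa\,\Delta^2u_0 + xab\,\Delta^3u_0 - xabc\,\Delta^4u_0 + \dots \qquad\ldots(1)$$

where $a = 1-\tfrac12(x+1)$, $b = 1-\tfrac13(x+1)$, $c = 1-\tfrac14(x+1)$.

[Delhi 1957]

Newton's Gregory formula provides :

$$u_x = u_0 + \binom{x}{1}\Delta u_0 + \binom{x}{2}\Delta^2u_0 + \binom{x}{3}\Delta^3u_0 + \binom{x}{4}\Delta^4u_0 + \dots \qquad\ldots(2)$$

Now
$$\binom{x}{2} = \frac{x(x-1)}{2!} = -x\cdot\frac{1-x}{2} = -x\bigl[1-\tfrac12(x+1)\bigr] = -xa, \qquad\ldots(3)$$

$$\binom{x}{3} = \frac{x(x-1)(x-2)}{3!} = (-1)^2\,\frac{x(1-x)(2-x)}{3!}
 = x\Bigl(\frac{1-x}{2}\Bigr)\Bigl(\frac{2-x}{3}\Bigr)
 = x\bigl\{1-\tfrac12(x+1)\bigr\}\bigl\{1-\tfrac13(x+1)\bigr\} = xab, \qquad\ldots(4)$$

$$\binom{x}{4} = \frac{x(x-1)(x-2)(x-3)}{4!} = -xabc, \qquad\ldots(5)$$

and (2), by means of (3), (4), (5), ..., assumes the form of (1).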

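Ex. 2. Find the function f(x) whose first difference is

(i) $e^x$, (ii) $\binom{x}{n}$ and (iii) $\sin x$.

(i) Since $\Delta e^x = e^{x+h}-e^x = (e^h-1)\,e^x$, the desired
function is $f(x) = e^x/(e^h-1)$.

(ii) Let the interval of differencing be unity. Then

$$\Delta\binom{x}{n} = \binom{x+1}{n}-\binom{x}{n} = \binom{x}{n-1}.$$

This implies that the operation of $\Delta$ on $\binom{x}{n}$ diminishes
the lower suffix by unity. Therefore,

$$\Delta\binom{x}{n+1} = \binom{x}{n},$$

and the desired function is $f(x) = \binom{x}{n+1}$.

(iii) Suppose that the function is

$$f(x) = A\sin x,$$

which gives :

$$\sin x = \Delta f(x) = A\,\{\sin(x+h)-\sin x\} = A\,(-\sin x-\sin x) \quad\text{for } h=\pi.$$

Then $A = -\tfrac12$. Consequently the required function is

$$f(x) = -\tfrac12\sin x.$$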

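Factorial Notation. A product of n factors of the form
$x(x-h)(x-2h)\dots$, each factor being less than the preceding one by
the interval of differencing h, is called a
factorial function and denoted by $x^{(n)}$. Then

$$x^{(n)} = x(x-h)(x-2h)\dots(x-\overline{n-1}\,h). \qquad\ldots(1)$$

Letting h=1 provides :

$$x^{(n)} = x(x-1)(x-2)\dots(x-n+1). \qquad\ldots(2)$$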

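Theorem 9. Let the interval of differencing be h. Then

$\Delta x^{(n)} = nh\,x^{(n-1)}$, and in general, $\Delta^n x^{(n)} = n!\,h^n$.

Particularly, when h=1, $\Delta x^{(n)} = n\,x^{(n-1)}$ and $\Delta^n x^{(n)} = n!$.

Proof. The definition of $\Delta$ gives

$$\begin{aligned}
\Delta x^{(n)} &= (x+h)^{(n)} - x^{(n)}\\
&= (x+h)\,x\,(x-h)\dots(x-\overline{n-2}\,h) - x\,(x-h)\dots(x-\overline{n-2}\,h)(x-\overline{n-1}\,h)\\
&= x(x-h)(x-2h)\dots(x-\overline{n-2}\,h)\,\bigl[(x+h)-(x-\overline{n-1}\,h)\bigr]\\
&= x(x-h)(x-2h)\dots(x-\overline{n-2}\,h)\cdot nh\\
&= nh\,x^{(n-1)}. \qquad\ldots(3)
\end{aligned}$$

By a similar argument,

$$\Delta^2 x^{(n)} = \Delta\bigl[\Delta x^{(n)}\bigr] = \Delta\bigl[nh\,x^{(n-1)}\bigr]
 = nh\,\Delta x^{(n-1)} = nh\,(n-1)h\,x^{(n-2)} = n(n-1)h^2x^{(n-2)}. \qquad\ldots(4)$$

In general, we find

$$\Delta^n x^{(n)} = n(n-1)(n-2)\dots3\cdot2\cdot1\cdot h^n = n!\,h^n. \qquad\ldots(5)$$

Letting h=1 in (3), (4) and (5) leads to

$$\Delta x^{(n)} = n\,x^{(n-1)}, \qquad \Delta^2 x^{(n)} = n(n-1)\,x^{(n-2)} \qquad\text{and}\qquad \Delta^n x^{(n)} = n!.$$

This finishes the proof.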
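Theorem 9 can be spot-checked with a few lines of code. This is an illustrative sketch only; the helper names `factorial_power` and `forward_difference` are assumptions introduced here, not notation from the text.

```python
def factorial_power(x, n, h=1):
    """x^(n) = x (x-h) (x-2h) ... (x-(n-1)h) for an integer n >= 0."""
    result = 1
    for i in range(n):
        result *= x - i * h
    return result

def forward_difference(f, x, h=1):
    """First forward difference: f(x+h) - f(x)."""
    return f(x + h) - f(x)

# Check  Delta x^(n) = n h x^(n-1)  at x = 7 with n = 4 and h = 2.
n, h, x = 4, 2, 7
lhs = forward_difference(lambda t: factorial_power(t, n, h), x, h)
rhs = n * h * factorial_power(x, n - 1, h)
print(lhs, rhs)   # both sides are equal
```

The check uses integers throughout, so no rounding is involved.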

Remark. In the factorial notation, the operator $\Delta$ is equivalent
to differentiation, i.e. $\Delta \equiv D$, where $D = \dfrac{d}{dx}$.

Theorem 10. $x^{(-1)} = \dfrac{1}{x+h}$, and in general,

$$x^{(-n)} = \frac{1}{(x+h)(x+2h)\dots(x+nh)}.$$

Proof. The definition of a factorial function $x^{(n)}$ gives :

$$x^{(n)} = x(x-h)(x-2h)\dots(x-\overline{n-2}\,h)(x-\overline{n-1}\,h).$$

This yields

$$x^{(n)} = (x-\overline{n-1}\,h)\,x^{(n-1)}. \qquad\ldots(6)$$

Letting n=0 gives

$$x^{(0)} = (x+h)\,x^{(-1)}. \qquad\ldots(7)$$

By convention $x^{(0)} = 1$.

Then $$x^{(-1)} = \frac{1}{x+h}. \qquad\ldots(8)$$

Putting n = -1 in (6) leads to

$$x^{(-1)} = (x+2h)\,x^{(-2)},$$

which gives :

$$x^{(-2)} = \frac{x^{(-1)}}{x+2h} = \frac{1}{(x+h)(x+2h)}, \quad\text{by (8)}. \qquad\ldots(9)$$

Similarly, $$x^{(-3)} = \frac{1}{(x+h)(x+2h)(x+3h)},$$

and so on.

In general, $$x^{(-n)} = \frac{1}{(x+h)(x+2h)\dots(x+nh)}. \qquad\ldots(10)$$

This proves the theorem.

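Theorem 11. $\Delta x^{(-n)} = -nh\,x^{(-n-1)}$ ;

$\Delta^2 x^{(-n)} = (-1)^2\,n(n+1)\,h^2\,x^{(-n-2)}$.

Particularly, when h=1, $\Delta x^{(-n)} = -n\,x^{(-n-1)}$ and
$\Delta^2 x^{(-n)} = n(n+1)\,x^{(-n-2)}$.

Proof.
$$\begin{aligned}
\Delta x^{(-n)} &= (x+h)^{(-n)} - x^{(-n)}\\
&= \frac{1}{(x+2h)(x+3h)\dots(x+\overline{n+1}\,h)} - \frac{1}{(x+h)(x+2h)\dots(x+nh)}\\
&= \frac{1}{(x+2h)(x+3h)\dots(x+nh)}\Bigl[\frac{1}{x+(n+1)h}-\frac{1}{x+h}\Bigr]\\
&= \frac{-nh}{(x+h)(x+2h)\dots(x+\overline{n+1}\,h)}\\
&= -nh\,x^{(-n-1)}. \qquad\ldots(11)
\end{aligned}$$

Now
$$\begin{aligned}
\Delta^2 x^{(-n)} &= \Delta\bigl[\Delta x^{(-n)}\bigr] = \Delta\bigl[-nh\,x^{(-n-1)}\bigr]\\
&= -nh\,\Delta x^{(-n-1)} = -nh\,(-n-1)\,h\,x^{(-n-2)}\\
&= (-1)^2\,n(n+1)\,h^2\,x^{(-n-2)}. \qquad\ldots(12)
\end{aligned}$$

Letting h=1 in (11) and (12) gives

$\Delta x^{(-n)} = -n\,x^{(-n-1)}$ and $\Delta^2 x^{(-n)} = n(n+1)\,x^{(-n-2)}$.

Remark. In the case of a negative index for the factorial function
also, the operator $\Delta \equiv D$, where $D = \dfrac{d}{dx}$.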

Illustrative Examples 

Ex. 1. Assume that the interval of differencing is unity. Then
express $2x^3-3x^2+3x-10$ and its differences in factorial notation.

[B. Sc. Agra 1965, 55]

In this problem, we shall use the method of synthetic division
(or the method of detached coefficients).

0 |  2   -3    3  | -10
  |       0    0  |
  +----------------
1 |  2   -3  |  3
  |       2    -1
  +-----------
2 |  2   -1  |  2
  |       4
  +------
     2  |  3

Thus $f(x) = 2x^3-3x^2+3x-10$

$= 2x^{(3)} + 3x^{(2)} + 2x^{(1)} - 10.$

Then by using Theorem 9,

$\Delta f(x) = 2\cdot3\,x^{(2)} + 3\cdot2\,x^{(1)} + 2 = 6x^{(2)} + 6x^{(1)} + 2,$
$\Delta^2 f(x) = 12x^{(1)} + 6$, and $\Delta^3 f(x) = 12.$
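The method of detached coefficients used above is repeated synthetic division by x, x-1, x-2, ..., the successive remainders being the coefficients of $x^{(0)}, x^{(1)}, x^{(2)}, \dots$. The sketch below assumes the interval of differencing is unity; the function name `to_factorial_coefficients` is hypothetical.

```python
def to_factorial_coefficients(coeffs):
    """Convert ordinary polynomial coefficients (highest power first)
    into factorial-notation coefficients (highest factorial first), h = 1."""
    work = list(coeffs)
    remainders = []
    k = 0
    while work:
        # Synthetic division of the current polynomial by (x - k).
        carry = 0
        quotient = []
        for c in work:
            carry = c + carry * k
            quotient.append(carry)
        remainders.append(quotient.pop())   # remainder = coefficient of x^(k)
        work = quotient
        k += 1
    return remainders[::-1]

# 2x^3 - 3x^2 + 3x - 10  ->  2x^(3) + 3x^(2) + 2x^(1) - 10
print(to_factorial_coefficients([2, -3, 3, -10]))
```

The same routine applied to the coefficients 1, 4, 9, 12 returns 1, 7, 14, 12.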
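Ex. 2. Find the function whose first difference is

$$x^3+4x^2+9x+12.$$

By hypothesis,

$$\Delta f(x) = x^3+4x^2+9x+12.$$

The method of detached coefficients gives :

$$\Delta f(x) = x^{(3)} + 7x^{(2)} + 14x^{(1)} + 12.$$

Then, treating $\Delta$ as differentiation in the factorial notation and
integrating,

$$\begin{aligned}
f(x) &= \Delta^{-1}\bigl[x^{(3)} + 7x^{(2)} + 14x^{(1)} + 12\bigr]\\
&= \frac{x^{(4)}}{4} + \frac{7x^{(3)}}{3} + \frac{14x^{(2)}}{2} + 12x^{(1)} + A\\
&= \frac{x^{(4)}}{4} + \frac{7x^{(3)}}{3} + 7x^{(2)} + 12x^{(1)} + A,
\end{aligned}$$

where A is assumed to be an arbitrary constant.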

Ex. 3. Find the relation between $\alpha$, $\beta$ and $\gamma$ in order that
$\alpha+\beta x+\gamma x^2$ may be expressible in one term in the factorial
notation.

[B. Sc. Agra 1963, 61]

To find the relation between $\alpha$, $\beta$ and $\gamma$, assume that

$$f(x) = \alpha+\beta x+\gamma x^2 = (a+bx)^{(2)}.$$

The definition that $x^{(2)} = x(x-1)$, when the interval of
differencing is unity, implies that

$$\begin{aligned}
\alpha+\beta x+\gamma x^2 &= (a+bx)\,[a+b(x-1)]\\
&= (a+bx)(a-b+bx)\\
&= (a^2-ab) + (2ab-b^2)\,x + b^2x^2.
\end{aligned}$$

Equating the coefficients of various powers of x leads to

$$\alpha = a^2-ab, \qquad \beta = 2ab-b^2, \qquad \gamma = b^2.$$

Now
$$\begin{aligned}
\beta^2 &= (2ab-b^2)^2 = b^2(2a-b)^2\\
&= b^2(4a^2-4ab+b^2)\\
&= b^2\{4(a^2-ab)+b^2\}\\
&= \gamma\,(4\alpha+\gamma) = 4\alpha\gamma+\gamma^2.
\end{aligned}$$

Therefore $\beta^2 = 4\alpha\gamma+\gamma^2$ is the required relation.

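18.9 Differences of Zero

Definition. The quantities $\Delta^n x^m\big|_{x=0}$, generally written as
$\Delta^n 0^m$, bear the name of differences of zero in view of the fact
that the leading term is always zero. Note that n and m are both
positive integers.

Theorem 12. Assume n and m are both positive integers. Then

$$\Delta^n 0^m = n^m - \binom{n}{1}(n-1)^m + \binom{n}{2}(n-2)^m - \dots \qquad\ldots(1)$$

Proof. Evidently,
$$\begin{aligned}
\Delta^n x^m &= (E-1)^n x^m\\
&= \Bigl[E^n - \binom{n}{1}E^{n-1} + \binom{n}{2}E^{n-2} - \dots + (-1)^n\Bigr]x^m\\
&= (x+n)^m - \binom{n}{1}(x+n-1)^m + \binom{n}{2}(x+n-2)^m - \dots
\end{aligned}$$

Putting x=0 leads to

$$\Delta^n 0^m = n^m - \binom{n}{1}(n-1)^m + \binom{n}{2}(n-2)^m - \dots$$

This proves the theorem.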

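Theorem 13. $\Delta^n 0^m = n\,\bigl(\Delta^{n-1}0^{m-1} + \Delta^n 0^{m-1}\bigr)$, ...(2)

where n and m are assumed to be positive integers.

Proof. The preceding theorem gives :

$$\begin{aligned}
\Delta^n 0^m &= n^m - \binom{n}{1}(n-1)^m + \binom{n}{2}(n-2)^m - \dots\\
&= n\Bigl[n^{m-1} - \binom{n-1}{1}(n-1)^{m-1} + \binom{n-1}{2}(n-2)^{m-1} - \dots\Bigr],
 \quad\text{since } \binom{n}{r}(n-r) = n\binom{n-1}{r}\\
&= n\Bigl[(1+\overline{n-1})^{m-1} - \binom{n-1}{1}(1+\overline{n-2})^{m-1} + \binom{n-1}{2}(1+\overline{n-3})^{m-1} - \dots\Bigr]\\
&= n\Bigl[E^{n-1} - \binom{n-1}{1}E^{n-2} + \binom{n-1}{2}E^{n-3} - \dots\Bigr]x^{m-1}\Big|_{x=1}\\
&= n\,\bigl[(E-1)^{n-1}x^{m-1}\bigr]_{x=1}\\
&= n\,\Delta^{n-1}1^{m-1}\\
&= n\,\Delta^{n-1}E\,0^{m-1} = n\,\Delta^{n-1}(1+\Delta)\,0^{m-1}\\
&= n\,\bigl(\Delta^{n-1}0^{m-1} + \Delta^{n}0^{m-1}\bigr).
\end{aligned}$$

Remark. Evidently

$$\frac{\Delta^n 0^m}{n!} = n\cdot\frac{\Delta^n 0^{m-1}}{n!} + \frac{\Delta^{n-1}0^{m-1}}{(n-1)!}.$$

This is a recurrence relation, by virtue of which we may
calculate $\Delta^n 0^m$ when $\Delta^{n-1}0^{m-1}$ and $\Delta^n 0^{m-1}$ are known.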
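The expansion of Theorem 12 and the recurrence of Theorem 13 can be compared directly. The sketch below is only an illustration under the assumption that n and m are positive integers; `difference_of_zero` is a name introduced here.

```python
from math import comb

def difference_of_zero(n, m):
    """Delta^n 0^m = n^m - C(n,1)(n-1)^m + C(n,2)(n-2)^m - ...  (Theorem 12)."""
    return sum((-1) ** r * comb(n, r) * (n - r) ** m for r in range(n + 1))

# Recurrence of Theorem 13 for n = 3, m = 4:
#   Delta^3 0^4 = 3 * (Delta^2 0^3 + Delta^3 0^3)
left = difference_of_zero(3, 4)
right = 3 * (difference_of_zero(2, 3) + difference_of_zero(3, 3))
print(left, right)   # the two values agree
```

The same function also shows that the difference vanishes when its order exceeds the power, for instance `difference_of_zero(4, 3)` is zero.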

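Theorem 14. Assume f(x) is a polynomial in x of degree m.
Then
$$x^m = \sum_{j=0}^{m}\frac{x^{(j)}}{j!}\,\Delta^j 0^m.$$

Proof. As a matter of fact,

$$\begin{aligned}
f(x) &= E^{x}f(0) = (1+\Delta)^{x}f(0)\\
&= f(0) + \binom{x}{1}\Delta f(0) + \binom{x}{2}\Delta^2 f(0) + \dots + \binom{x}{m}\Delta^m f(0).
\end{aligned}$$

Letting $f(x) = x^m$ yields :

$$\begin{aligned}
x^m &= 0^m + \binom{x}{1}\Delta 0^m + \binom{x}{2}\Delta^2 0^m + \dots + \binom{x}{m}\Delta^m 0^m\\
&= \sum_{j=0}^{m}\binom{x}{j}\Delta^j 0^m = \sum_{j=0}^{m}\frac{x^{(j)}}{j!}\,\Delta^j 0^m.
\end{aligned}$$

This finishes the proof.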

Theorem 15. Let n be a positive integer. Then

$\Delta^n 0^n = n!$, $\Delta^{n+j}\,0^n = 0$,
i.e. $\Delta^m 0^n = 0$, $n<m$.

Proof. From the preceding theorem, we find

$$x^m = \sum_{j=0}^{m}\frac{x^{(j)}}{j!}\,\Delta^j 0^m.$$

Then
$$x^n = \sum_{j=0}^{n}\frac{x^{(j)}}{j!}\,\Delta^j 0^n.$$

Identification of the coefficients of $x^{r}$ for r=n and r>n on both
sides gives $\Delta^n 0^n = n!$

and $\Delta^{n+j}\,0^n = 0$, i.e. $\Delta^m 0^n = 0$ for $n<m$.

The proof is complete.

Exercises 

1. Define $\Delta$ and E, and then establish :

(a) $E = 1+\Delta$, (b) $E^n f(a) = f(a+nh)$,

(c) $E^n = (1+\Delta)^n$ and (d) $E\Delta = \Delta E$.

[Meerut B. Sc. 69]

2. Show that the operations E and $\Delta$ of interpolation theory obey
the distributive, commutative and index laws of algebra.

[Delhi B. Sc. (Hons) 67, 66]

3. Evaluate (a) $\Delta^2(3e^x)$ and (b) $\Delta^2(\cos 2x)$.

[Meerut B. Sc. 68; Lucknow B. Sc. 67]

[Ans. (a) $3(e^h-1)^2e^x$, (b) $-4\sin^2 h\,\cos(2x+2h)$]

4. (a) Let $u_x = e^{ax+b}$. Then $\Delta^n u_x = e^{ax+b}(e^a-1)^n$.

[Bangalore B. Sc. 69]

(b) $\Delta^2\Bigl(\dfrac{5x+12}{x^2+5x+6}\Bigr) = \dfrac{2(5x+16)}{(x+2)(x+3)(x+4)(x+5)}$,

where the interval of differencing is unity.

(c) $E^2x^2 = x^2+8x+16$ when the interval of differencing is 2.

5. Define the operators $\Delta$, E and D used in the calculus of
finite differences. Then

(a) $E = 1+\Delta = e^{hD}$

(b) $\Bigl(\dfrac{\Delta^2}{E}\Bigr)e^x\cdot\dfrac{Ee^x}{\Delta^2e^x} = e^x$, where h=1.

[Mysore B. Sc. 68; Bangalore B. Sc. 69; Delhi B. Sc. 69]

(c) Distinguish between $\Bigl(\dfrac{\Delta^2}{E}\Bigr)u_x$ and $\dfrac{\Delta^2u_x}{Eu_x}$,
and compute the values of these functions when $u_x = x^3$.

[Agra B. Sc. 58; Delhi B. Sc. (Hons) 68, 67;
Bangalore 69; Lucknow 62; Patna 61]

6. Let h denote the interval of differencing. Then

(a) $\Delta\sin x = 2\sin(h/2)\cos(x+h/2)$

(b) $\Delta\cos x = -2\sin(h/2)\sin(x+h/2)$

7. Define the functions $x^{(m)}$ and $x^{(-m)}$ and find their nth
differences, distinguishing between the cases when n is less than,
equal to or greater than m.

8. Show that $\Delta^n x^{(n)} = n!\,h^n$ and $\Delta^{n+1}x^{(n)} = 0$.

9. Represent the function
$f(x) = x^4-12x^3+24x^2-30x+9$

and its successive differences in factorial notation.

[Agra B. Sc. 1964; Punjab B. A. 1956]

[Ans. $f(x) = x^{(4)}-6x^{(3)}-5x^{(2)}-17x^{(1)}+9$ ;
$\Delta f(x) = 4x^{(3)}-18x^{(2)}-10x^{(1)}-17$ ;
$\Delta^2 f(x) = 12x^{(2)}-36x^{(1)}-10$, $\Delta^3 f(x) = 24x^{(1)}-36$,
$\Delta^4 f(x) = 24$, $\Delta^5 f(x) = 0$]

10. Show that the function, whose first difference is
$9x^2+11x+5$, is $3x^3+x^2+x+A$.

11. (a) Express x*-3x+1 in factorials. Hence find its third
difference.

(b) Express w r =A 3 = 3A 2 +5A +12 

in the factorial notation. Hence find the value of Au x . 

[Sardar Patel B. Sc. 68]

12. Find the function $u_x$ for which :

(a) Au x =„, (b ) Aur = 7u, ( c , j, Ur=9ux and Au , = x t 

<•> <«—i-'+i-i 

=0 , if m> , 

(b) $\Delta^n 0^{n+1} = \dfrac{n(n+1)}{2}\,\Delta^n 0^n = \tfrac12\,n\,(n+1)!$ [Delhi M. A. 55]

(c) $(n+1)\,\Delta^n 0^n = 2\,\bigl(\Delta^{n-1}0^n + \Delta^n 0^n\bigr)$ [Delhi M. A. 61; Patna 57]

14. Prove that

(i) $\nabla E = E\nabla = \Delta$, where $\nabla u_x = u_x-u_{x-1}$.

18.10 Interpolation.

Definition. Interpolation is in fact the method of estimating
an unknown value with the help of a given set of observations.

Suppose we are provided with the values of a function
y = f(x) for a set of values of x (the arguments). Then interpolation
defines the technique of obtaining the estimate of f(x) for any
intermediate value of the argument.

In the words of Thiele, interpolation is nothing but the art of
reading between the lines of a table.

The value of f(x) thus found is called the interpolated value.

Let the entries corresponding to the arguments x = a, a+h,
a+2h, ..., a+nh be f(a), f(a+h), ..., f(a+nh). Then the technique
of finding f(m) for x = m, where m lies in the range a
to a+nh, defines interpolation, whereas the technique of estimating
f(x) for any value of the argument x outside the given range of
arguments is termed extrapolation.

Examples. (1) Assume we are given the census figures for
the population of India for the years 1931, 41, 51, 61 and 71. Then
the estimate of the population figure for the year 1959
falls in the domain of interpolation, whereas the estimate of the
population figure for the year 1977 falls in the domain of
extrapolation.

(2) Let us consider the following table : 

Argument x 0 5 .'o 23 24 

In this problem, the method for estimating f(16) defines
interpolation, whereas that for estimating f(30) defines
extrapolation.

Assumptions.

1. The values of the function (or the entries) should be
either in an increasing order or in a decreasing order. This
implies that there exist no sudden jumps or falls in the values of
the entries for the period under consideration. Equivalently, the
given data are not subject to abnormal periods such as periods of
famines, wars, epidemics, etc., which may result in sudden changes
in the values of f(x). Mathematically, the data are capable of
being represented by a smooth or a continuous curve, which means
that the data can be expressed by a polynomial of a certain
degree. The following theorem helps us in determining such a
polynomial.

One and only one polynomial curve of degree less than or equal
to n goes through a given set of (n+1) distinct points.

The reader must bear in mind that all the formulae of
interpolation are based upon the fundamental assumption that the
data are expressible in a polynomial form with a fair degree of
accuracy.

2. The rise and fall in the values of the data should be
uniform.

3. There must exist some sort of relationship between the
two variables.

Methods of Interpolation

There are three methods of interpolation, which are as
follows :

(i) the method of graph,

(ii) the method of curve fitting,

(iii) the method of using various formulae of the calculus of
finite differences.

But we confine our attention to the study of the third method
only.

By Theorem 8, for each n,

$$f(a+nh) = f(a) + \binom{n}{1}\Delta f(a) + \binom{n}{2}\Delta^2 f(a) + \dots + \Delta^n f(a). \qquad\ldots(1)$$

Letting a=0, h=1, n=x in (1) leads to

$$f(x) = f(0) + \binom{x}{1}\Delta f(0) + \binom{x}{2}\Delta^2 f(0) + \dots + \Delta^x f(0). \qquad\ldots(2)$$

Assume f(x) is given for x = 0, 1, 2, ..., n only. Then (2)
assumes the form

$$f(x) = f(0) + \binom{x}{1}\Delta f(0) + \binom{x}{2}\Delta^2 f(0) + \dots + \binom{x}{n}\Delta^n f(0). \qquad\ldots(3)$$

(3) is called the Gregory-Newton's advancing difference
formula.

Gregory-Newton's Backward Interpolation Formula.

Theorem 16.

$$f(a+xh) = f(a) + x\,\nabla f(a) + \frac{x(x+1)}{2!}\nabla^2 f(a) + \frac{x(x+1)(x+2)}{3!}\nabla^3 f(a) + \dots
 + \frac{x(x+1)(x+2)\dots(x+n-1)}{n!}\nabla^n f(a). \qquad\ldots(4)$$

We call (4) Gregory-Newton's backward difference formula.

Proof. The definition of $\nabla$ says that $\nabla = 1-E^{-1}$, i.e.
$E = (1-\nabla)^{-1}$. Then

$$\begin{aligned}
f(a+xh) &= E^{x}f(a) = (1-\nabla)^{-x}f(a)\\
&= f(a) + x\,\nabla f(a) + \frac{x(x+1)}{2!}\nabla^2 f(a) + \frac{x(x+1)(x+2)}{3!}\nabla^3 f(a) + \dots \qquad\ldots(5)
\end{aligned}$$

where the last term depends upon the degree of the polynomial
f(a+xh). If f(a+xh) is assumed to be a polynomial
of nth degree, then $\nabla^r f(a) = 0$ for each r > n.

Then the series (5) terminates after (n+1) terms, and hence
(4) is established.
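A minimal computational sketch of formula (4) follows. It assumes the entries are tabulated at equal intervals h and that a is the last tabulated argument; the function name `newton_backward` is an assumption made for illustration.

```python
def newton_backward(a, h, entries, x):
    """Backward-difference estimate of f(x).

    entries are f(a-(m)h), ..., f(a-h), f(a): the LAST entry belongs to a.
    """
    u = (x - a) / h
    row = list(entries)
    total, coeff, k = 0.0, 1.0, 0
    while row:
        total += coeff * row[-1]            # row[-1] is the k-th backward difference at a
        coeff *= (u + k) / (k + 1)          # builds u(u+1)...(u+k)/(k+1)!
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
        k += 1
    return total

# Entries of x^3 at x = 0, 1, 2, 3; the estimate at 2.5 is exact for a cubic.
print(newton_backward(3, 1, [0, 1, 8, 27], 2.5))
```

Because the third differences of a cubic are constant, the series terminates and the value 15.625 is reproduced exactly.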



Missing Terms (at equal intervals). Assume we are given a
set of equidistant terms with one or more terms missing.
The problem of estimating such terms may be easily handled by
using the operators E and $\Delta$.

The following table indicates that we are given (n+1)
equidistant arguments, x = 0, 1, 2, ..., n, say, but the entry $u_j$
corresponding to one of them is not given. Then we wish to
estimate $u_j$. In view of n entries being given, the data can be
represented by a polynomial of (n-1)th degree. Hence we are
justified in assuming that $u_x$ is a polynomial of (n-1)th degree.

Argument : 0, 1, 2, ..., j, ..., n

Entry : $u_0, u_1, u_2, \dots, (u_j), \dots, u_n$

Then $\Delta^{n-1}u_x$ = constant

and $\Delta^n u_x = 0$, x = 0, 1, 2, ...

Particularly, $\Delta^n u_0 = 0$,

which implies : $(E-1)^n u_0 = 0$

i.e. $\bigl[E^n - \binom{n}{1}E^{n-1} + \binom{n}{2}E^{n-2} - \dots + (-1)^n\bigr]u_0 = 0$

i.e. $u_n - \binom{n}{1}u_{n-1} + \binom{n}{2}u_{n-2} - \dots + (-1)^n u_0 = 0.$

This is the required equation, by means of which the
missing entry can be easily computed.

Now assume that two entries $u_j$ and $u_k$ are missing in a set
of (n+2) equidistant arguments. Then

$\Delta^n u_x = 0$, x = 0, 1, 2, ... $\Rightarrow$
$\Delta^n u_0 = 0$, $\Delta^n u_1 = 0$ $\Rightarrow$ $(E-1)^n u_0 = 0$, $(E-1)^n u_1 = 0$.

These provide :

$u_n - \binom{n}{1}u_{n-1} + \binom{n}{2}u_{n-2} - \dots + (-1)^n u_0 = 0$
and $u_{n+1} - \binom{n}{1}u_n + \binom{n}{2}u_{n-1} - \dots + (-1)^n u_1 = 0.$

The two missing terms can be estimated by solving these
two equations.

A similar argument leads to the estimate of three missing terms
by solving the equations :

$\Delta^n u_0 = 0$, $\Delta^n u_1 = 0$ and $\Delta^n u_2 = 0.$

Illustrative Examples.

Ex. 1. The numbers of members of a certain society are as
given in the following table :

Year :          1910  1911  1912  1913  1914  1915  1916  1917  1918  1919
Number $u_x$ :   845   867    ?    846   821   772    ?    757   761   796

Find the best estimate of the members in 1912 and 1916. Also
compute $u_4$, given

$u_0+u_8 = 1.9243$, $u_1+u_7 = 1.9590$,
$u_2+u_6 = 1.9823$, $u_3+u_5 = 1.9956$.

[B. Sc. (Hons.) Delhi 66]

The above table indicates that we have 8 entries at hand.
Then $u_x$ can be represented by a seventh degree polynomial, and
hence

$\Delta^7 u_x$ = constant and $\Delta^8 u_x = 0$.

Setting 1910 as origin implies :

$u_0 = 845$, $u_1 = 867$, $u_2 = ?$, $u_3 = 846$, $u_4 = 821$,
$u_5 = 772$, $u_6 = ?$, $u_7 = 757$, $u_8 = 761$, $u_9 = 796$.

Particularly, $\Delta^8 u_x = 0$ gives

$\Delta^8 u_0 = 0$, $\Delta^8 u_1 = 0$,

from which we find

$(E-1)^8 u_0 = 0$, $(E-1)^8 u_1 = 0$.

Now $(E-1)^8 u_0 = 0$ gives :

$\bigl[E^8 - \binom81E^7 + \binom82E^6 - \binom83E^5 + \binom84E^4 - \binom85E^3 + \binom86E^2 - \binom87E + 1\bigr]u_0 = 0,$

i.e. $(E^8 - 8E^7 + 28E^6 - 56E^5 + 70E^4 - 56E^3 + 28E^2 - 8E + 1)\,u_0 = 0$

i.e. $(u_8+u_0) - 8(u_7+u_1) + 28(u_6+u_2) - 56(u_5+u_3) + 70u_4 = 0$ ...(1)

i.e. $28\,(u_2+u_6) = 44524$ ...(2)

From $\Delta^8 u_1 = 0$, we have

$(u_9+u_1) - 8(u_8+u_2) + 28(u_7+u_3) - 56(u_6+u_4) + 70u_5 = 0.$

Then $8u_2 + 56u_6 = 48523$ ...(3)

On solving (2) and (3), we find

$u_2 = 844.2708$, $u_6 = 745.8720$.

To compute $u_4$ from the second set of data, we shall utilize (1).
Then (1) yields :

$1.9243 - 8\times1.9590 + 28\times1.9823 - 56\times1.9956 + 70\,u_4 = 0,$

which gives :

$$u_4 = \frac{69.9969}{70} = 0.9999557.$$

This is what we wished to find.
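The two simultaneous equations of this example can be set up and solved mechanically. The sketch below is illustrative only: it assumes exactly two missing entries (marked `None`) in an equally spaced table, and the function name `fill_two_missing` is hypothetical. Exact fractions are used and the pair of linear equations is solved by Cramer's rule.

```python
from fractions import Fraction
from math import comb

def fill_two_missing(values):
    """Estimate two missing entries (None) from Delta^n u_0 = 0 and Delta^n u_1 = 0,
    where n = number of known entries = len(values) - 2."""
    order = len(values) - 2
    j, k = [i for i, v in enumerate(values) if v is None]
    rows = []
    for start in (0, 1):
        a = b = c = Fraction(0)
        for r in range(order + 1):
            w = (-1) ** (order - r) * comb(order, r)   # coefficient of u_{start+r}
            idx = start + r
            if idx == j:
                a += w
            elif idx == k:
                b += w
            else:
                c -= w * Fraction(values[idx])          # known terms moved to the right side
        rows.append((a, b, c))
    (a1, b1, c1), (a2, b2, c2) = rows
    det = a1 * b2 - a2 * b1
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

members = [845, 867, None, 846, 821, 772, None, 757, 761, 796]
u2, u6 = fill_two_missing(members)
print(round(float(u2), 4), round(float(u6), 4))
```

With the society data the routine reproduces the estimates for 1912 and 1916 obtained above.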

Ex. 2. If $l_x$ represents the number of persons living at age x
in a life table, find as accurately as the data permit, $l_x$ for values
of x = 35, 42 and 47 from the following data :

$l_{20} = 512$, $l_{30} = 439$, $l_{40} = 346$, $l_{50} = 243$.

Our hypothesis provides the following table :

Age x :               20,   30,   40,   50
Number of persons :  512,  439,  346,  243

The following table is constructed from the above one :

x     $l_x$   $\Delta l_x$           $\Delta^2 l_x$   $\Delta^3 l_x$

20    512
              439-512 = -73
30    439                            -20
              346-439 = -93                           10
40    346                            -10
              243-346 = -103
50    243

Gregory-Newton's forward interpolation formula for equal
intervals leads to

$$f(a+xh) = f(a) + \binom{x}{1}\Delta f(a) + \binom{x}{2}\Delta^2 f(a) + \binom{x}{3}\Delta^3 f(a).$$

Inserting a=20 and h=10, we have

$$\begin{aligned}
l_{20+10x} &= l_{20} + \binom{x}{1}\Delta l_{20} + \binom{x}{2}\Delta^2 l_{20} + \binom{x}{3}\Delta^3 l_{20}\\
&= 512 + x\,(-73) + \frac{x(x-1)}{2!}\,(-20) + \frac{x(x-1)(x-2)}{3!}\,(10).
\end{aligned}$$

Let x=1.5. Then

$$\begin{aligned}
l_{35} &= 512 + 1.5\,(-73) + \frac{1.5\,(1.5-1)}{2}\,(-20) + \frac{1.5\,(1.5-1)(1.5-2)}{6}\,(10)\\
&= 512 - 109.5 - 7.5 - 0.625 = 394.375 \approx 394.
\end{aligned}$$

Let 10x=22. Then x=2.2 and hence

$$\begin{aligned}
l_{42} &= 512 + 2.2\,(-73) + \frac{2.2\,(2.2-1)}{2}\,(-20) + \frac{2.2\,(2.2-1)(2.2-2)}{6}\,(10)\\
&= 512 - 160.6 - 26.4 + 0.88\\
&= 512 - 187 + 0.88\\
&= 325.88 \approx 326.
\end{aligned}$$

Let 10x=27. Then x=2.7 and hence

$$\begin{aligned}
l_{47} &= 512 + 2.7\,(-73) + \frac{(2.7)(1.7)}{2}\,(-20) + \frac{(2.7)(1.7)(0.7)}{6}\,(10)\\
&= 512 - 197.1 - 45.90 + 5.355\\
&= 512 - 243 + 5.355\\
&= 274.355 \approx 274.
\end{aligned}$$

Hence the number of persons living at age 35, 42 and 47 is
394, 326 and 274 respectively.
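The arithmetic of this example can be reproduced with a short routine for the Gregory-Newton forward formula. This is a sketch under the assumption of equally spaced arguments starting at a; the names `leading_differences` and `newton_forward` are introduced here for illustration.

```python
def leading_differences(entries):
    """Return f(a), Delta f(a), Delta^2 f(a), ... from equally spaced entries."""
    leading = []
    row = list(entries)
    while row:
        leading.append(row[0])
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    return leading

def newton_forward(a, h, entries, x):
    """Gregory-Newton forward estimate of f(x); entries start at argument a."""
    u = (x - a) / h
    total, coeff = 0.0, 1.0
    for k, d in enumerate(leading_differences(entries)):
        total += coeff * d
        coeff *= (u - k) / (k + 1)      # builds u(u-1)...(u-k)/(k+1)!
    return total

life_table = [512, 439, 346, 243]       # l_20, l_30, l_40, l_50
for age in (35, 42, 47):
    print(age, round(newton_forward(20, 10, life_table, age), 3))
```

The printed values can be compared with the hand computation of $l_{35}$, $l_{42}$ and $l_{47}$.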

Ex. 3. Compute sin 52° from the following data :
sin 45° = 0.7071, sin 50° = 0.7660,
sin 55° = 0.8192, sin 60° = 0.8660.

The following is the table of finite differences :

x     f(x) = sin x   $\Delta f(x)$   $\Delta^2 f(x)$   $\Delta^3 f(x)$

45    0.7071
                     0.0589
50    0.7660                         -0.0057
                     0.0532                            -0.0007
55    0.8192                         -0.0064
                     0.0468
60    0.8660

By Gregory-Newton's formula, we find

$$f(a+xh) = f(a) + \binom{x}{1}\Delta f(a) + \binom{x}{2}\Delta^2 f(a) + \binom{x}{3}\Delta^3 f(a).$$

Letting a=45 gives f(a) = 0.7071, and
a+xh = 52 with h=5 provides : 5x = 52-45
= 7, giving x = 1.4.

Then
$$\begin{aligned}
f(52) &= 0.7071 + 1.4\,(0.0589) + \frac{(1.4)(0.4)}{2}\,(-0.0057) + \frac{(1.4)(0.4)(-0.6)}{6}\,(-0.0007)\\
&= 0.7071 + 0.08246 - 0.001596 + 0.0000392\\
&= 0.7880032.
\end{aligned}$$

This implies that sin 52° = 0.7880032.

Exercises 

1. Define the terms interpolation and extrapolation. Establish
Newton's interpolation formula for equidistant intervals, viz.

$$u_{x+n} = u_x + \binom{n}{1}\Delta u_x + \binom{n}{2}\Delta^2u_x + \dots + \Delta^n u_x,$$

where n is a positive integer. (Delhi B.Sc. 59)

2. Establish Newton's forward difference interpolation
formula :

$$u_x = u_0 + x\,\Delta u_0 + \frac{x(x-1)}{2!}\Delta^2u_0 + \dots + \frac{x(x-1)\dots(x-n+1)}{n!}\Delta^n u_0.$$

(a) Discuss the suitability of the above formula.

(b) Is the above formula a good one in case $u_x = \sin \pi x$ ?

(c) Let $u_x$ be a polynomial of degree not exceeding 5. Then
how many terms in the right hand expression will give you an
exact fit ? [Nagar B. Sc.]

3. From the following table of yearly premiums for policies
maturing at quinquennial ages, estimate the premium for policies
maturing at the age of 46 years.

Age x :            45     50     55     60     65
Premium $u_x$ :  2.871  2.404  2.083  1.862  1.712

[Ans. 2.753 approximately]

4. The population of a country is as given below. Estimate
the population for the year 1925.

Year t :                   1891  1901  1911  1921  1931
Population (000) $u_t$ :     46    66    81    93   101

[Ans. 96.8368]

5. Establish Newton's backward formula :

$$u_x = u_0 + x\,\nabla u_0 + \frac{x(x+1)}{2!}\nabla^2u_0 + \dots + \frac{x(x+1)\dots(x+n-1)}{n!}\nabla^n u_0$$

for x = 0, -1, -2, ..., -n.

6. From the following table, find the number of students
obtaining marks less than 45.

Marks :            30-40  40-50  50-60  60-70  70-80
No. of Students :    31     42     51     35     31

[Ans. 48]

7. Complete the following table :

x :       0     0.1    0.2    0.3    0.4    0.5    0.6
$u_x$ : 0.135    *    0.111  0.100    *    0.082  0.074

[Ans. $u_{0.1} = 0.123$, $u_{0.4} = 0.090$]

8. Show that w 2 . 26 =97519-4453, given that 

w 0 =982Q3, u l= 97843, u 2 -975 ;9, u 3 =97034 

9. Let $u_0 = 580$, $u_1 = 556$, $u_2 = 520$ and $u_4 = 385$.
Then $u_3 = 465$.

10. Find the estimate of the missing figures in the following
table :

x :   2.0    2.1    2.2    2.3    2.4    2.5    2.6
y :  0.135    *    0.111  0.100    *    0.082  0.074

(I.A.S.; Punjab B.A. 1956)

[Ans. $y_{2.1} = 0.123$, $y_{2.4} = 0.090$]

11. (a) Find the missing term in the following table :

x :  0   1   2   3   4
y :  1   3   9   *   81

Explain why the resulting value differs from $3^3$ or 27.

(I.A.S. 1952)

(b) Find the missing term in the following table :

x :  16   18   20   22    24    26
y :  39   85   *    151   264   3.8

[Ans. (a) 31, (b) 95.9]

18.11 Divided Difference

Definition : We define a divided difference as the difference
between two successive values of the entry divided by the difference
between the corresponding values of the argument. For clarity of
ideas, let the values of the function y = f(x) corresponding to the
arguments

x = a, b, c, ...

be f(x) = f(a), f(b), f(c), ...

Then the first divided difference

at x = a is $\dfrac{f(b)-f(a)}{b-a}$,

at x = b is $\dfrac{f(c)-f(b)}{c-b}$, and so on.

Notations for Divided Differences.

There exist various types of notations for divided differences
used by different authors. In this book we shall use the symbol
$\triangle\!\!\!|$ for divided difference.

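The following table tells us how to construct a divided
difference table. Column (1) contains the arguments, column (2) the
entries, and columns (3), (4), (5) the divided differences of the
first, second and third orders.

Table

(1) Argument x :  a, b, c, d, e

(2) Entry f(x) :  f(a), f(b), f(c), f(d), f(e)

(3) First order, $\triangle\!\!\!|\,f(x)$ :

$\triangle\!\!\!|_{\,b}\,f(a) = \dfrac{f(b)-f(a)}{b-a}$, $\quad \triangle\!\!\!|_{\,c}\,f(b) = \dfrac{f(c)-f(b)}{c-b}$,

$\triangle\!\!\!|_{\,d}\,f(c) = \dfrac{f(d)-f(c)}{d-c}$, $\quad \triangle\!\!\!|_{\,e}\,f(d) = \dfrac{f(e)-f(d)}{e-d}$

(4) Second order, $\triangle\!\!\!|^{\,2}f(x)$ :

$\triangle\!\!\!|^{\,2}_{\,bc}\,f(a) = \dfrac{\triangle\!\!\!|_{\,c}\,f(b)-\triangle\!\!\!|_{\,b}\,f(a)}{c-a}$, $\quad \triangle\!\!\!|^{\,2}_{\,cd}\,f(b) = \dfrac{\triangle\!\!\!|_{\,d}\,f(c)-\triangle\!\!\!|_{\,c}\,f(b)}{d-b}$,

$\triangle\!\!\!|^{\,2}_{\,de}\,f(c) = \dfrac{\triangle\!\!\!|_{\,e}\,f(d)-\triangle\!\!\!|_{\,d}\,f(c)}{e-c}$

(5) Third order, $\triangle\!\!\!|^{\,3}f(x)$ :

$\triangle\!\!\!|^{\,3}_{\,bcd}\,f(a) = \dfrac{\triangle\!\!\!|^{\,2}_{\,cd}\,f(b)-\triangle\!\!\!|^{\,2}_{\,bc}\,f(a)}{d-a}$, $\quad \triangle\!\!\!|^{\,3}_{\,cde}\,f(b) = \dfrac{\triangle\!\!\!|^{\,2}_{\,de}\,f(c)-\triangle\!\!\!|^{\,2}_{\,cd}\,f(b)}{e-b}$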

Remark. 1. $\triangle\!\!\!|_{\,b}\,f(a)$ is the divided difference of the function
f(x) at the point x = a when the entry at x = b is taken into
consideration.

2. $\triangle\!\!\!|^{\,2}_{\,bc}\,f(a)$ is the divided difference of the first order
divided differences at x = a and x = b.

3. From the above table it is clear that there is a corresponding
increase in the total number of suffixes in the operator and
operand as the order of the differences increases. For example,
in $\triangle\!\!\!|_{\,b}\,f(a)$ there are only two suffixes a and b, in
$\triangle\!\!\!|^{\,2}_{\,bc}\,f(a)$ there are three suffixes a, b and c, and so on.

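Simple device to write down a divided difference.

From the definition,

$$\triangle\!\!\!|_{\,b}\,f(a) = \frac{f(b)-f(a)}{b-a} = \frac{f(a)}{a-b} + \frac{f(b)}{b-a}.$$

Similarly,

$$\begin{aligned}
\triangle\!\!\!|^{\,2}_{\,bc}\,f(a) &= \triangle\!\!\!|_{\,c}\bigl[\triangle\!\!\!|_{\,b}\,f(a)\bigr]
 = \frac{1}{a-c}\Bigl[\frac{f(a)}{a-b}+\frac{f(b)}{b-a}\Bigr] + \frac{1}{c-a}\Bigl[\frac{f(c)}{c-b}+\frac{f(b)}{b-c}\Bigr]\\
&= \frac{f(a)}{(a-b)(a-c)} + \frac{f(b)}{(b-a)(b-c)} + \frac{f(c)}{(c-a)(c-b)}.
\end{aligned}$$

Now
$$\begin{aligned}
\triangle\!\!\!|^{\,3}_{\,bcd}\,f(a) &= \triangle\!\!\!|_{\,d}\bigl(\triangle\!\!\!|^{\,2}_{\,bc}\,f(a)\bigr)\\
&= \frac{1}{a-d}\Bigl[\frac{f(a)}{(a-b)(a-c)} + \frac{f(b)}{(b-a)(b-c)} + \frac{f(c)}{(c-a)(c-b)}\Bigr]\\
&\quad + \frac{1}{d-a}\Bigl[\frac{f(d)}{(d-b)(d-c)} + \frac{f(b)}{(b-d)(b-c)} + \frac{f(c)}{(c-d)(c-b)}\Bigr]\\
&= \frac{f(a)}{(a-b)(a-c)(a-d)} + \frac{f(b)}{(b-a)(b-c)(b-d)}\\
&\quad + \frac{f(c)}{(c-a)(c-b)(c-d)} + \frac{f(d)}{(d-a)(d-b)(d-c)}
\end{aligned}$$

and so on. In the same way the higher order divided differences can
be written down in terms of the values of the entry.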

Illustrative Examples 

Ex. 1. Form a divided difference table for the following data :

x :     1    2    4    7    12
f(x) :  22   30   82   106  216

[Punjab B. A. 1956]

The divided difference table is as follows :

x    f(x)  $\triangle\!\!\!|\,f(x)$              $\triangle\!\!\!|^{\,2}f(x)$                  $\triangle\!\!\!|^{\,3}f(x)$                    $\triangle\!\!\!|^{\,4}f(x)$

1    22
           (30-22)/(2-1) = 8
2    30                                (26-8)/(4-1) = 6
           (82-30)/(4-2) = 26                                    (-3.6-6)/(7-1) = -1.6
4    82                                (8-26)/(7-2) = -3.6                                       (0.535-(-1.6))/(12-1) = 0.194
           (106-82)/(7-4) = 8                                    (1.75-(-3.6))/(12-2) = 0.535
7    106                               (22-8)/(12-4) = 1.75
           (216-106)/(12-7) = 22
12   216
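The construction of a divided difference table is mechanical and can be sketched in a few lines. The function name `divided_difference_table` is an assumption made for illustration; exact fractions are used so that the entries can be compared with the hand computation without rounding.

```python
from fractions import Fraction

def divided_difference_table(xs, ys):
    """Return a list of columns: column 0 holds the entries, column k the
    k-th order divided differences (arguments need not be equally spaced)."""
    table = [[Fraction(y) for y in ys]]
    for order in range(1, len(xs)):
        prev = table[-1]
        table.append([
            (prev[i + 1] - prev[i]) / (xs[i + order] - xs[i])
            for i in range(len(prev) - 1)
        ])
    return table

table = divided_difference_table([1, 2, 4, 7, 12], [22, 30, 82, 106, 216])
for order, column in enumerate(table):
    print(order, [float(v) for v in column])
```

For the data of Ex. 1 the last column contains the single fourth order difference 427/2200, which is 0.194 to three places.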

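Ex. 2. Compute $\triangle\!\!\!|_{\,b}\,f(a)$, $\triangle\!\!\!|^{\,2}_{\,bc}\,f(a)$ and $\triangle\!\!\!|^{\,3}_{\,bcd}\,f(a)$ for the
function $f(x) = 1/x^2$.

By definition,

$$\begin{aligned}
\triangle\!\!\!|_{\,b}\,f(a) &= \frac{f(b)-f(a)}{b-a} = \frac{f(a)}{a-b} + \frac{f(b)}{b-a}\\
&= \frac{1/a^2}{a-b} + \frac{1/b^2}{b-a}
 = \frac{-(a^2-b^2)}{(a-b)\,a^2b^2}
 = -\frac{a+b}{a^2b^2}.
\end{aligned}$$

By definition,

$$\begin{aligned}
\triangle\!\!\!|^{\,2}_{\,bc}\,f(a) &= \triangle\!\!\!|_{\,c}\bigl(\triangle\!\!\!|_{\,b}\,f(a)\bigr) = \triangle\!\!\!|_{\,c}\Bigl(-\frac{a+b}{a^2b^2}\Bigr)\\
&= \frac{1}{c-a}\Bigl[-\frac{b+c}{b^2c^2} + \frac{a+b}{a^2b^2}\Bigr]\\
&= \frac{1}{c-a}\cdot\frac{-a^2(b+c)+c^2(a+b)}{a^2b^2c^2}\\
&= \frac{1}{c-a}\cdot\frac{ac\,(c-a)+b\,(c^2-a^2)}{a^2b^2c^2}\\
&= \frac{1}{c-a}\cdot\frac{(c-a)(ac+ab+bc)}{a^2b^2c^2}\\
&= \frac{ab+bc+ca}{a^2b^2c^2}.
\end{aligned}$$

Again, the definition of $\triangle\!\!\!|^{\,3}_{\,bcd}\,f(a)$ gives

$$\begin{aligned}
\triangle\!\!\!|^{\,3}_{\,bcd}\,f(a) &= \triangle\!\!\!|_{\,d}\bigl(\triangle\!\!\!|^{\,2}_{\,bc}\,f(a)\bigr) = \triangle\!\!\!|_{\,d}\Bigl(\frac{ab+bc+ca}{a^2b^2c^2}\Bigr)\\
&= \frac{1}{d-a}\Bigl[\frac{db+bc+cd}{d^2b^2c^2} - \frac{ab+bc+ca}{a^2b^2c^2}\Bigr]\\
&= \frac{1}{(d-a)\,a^2b^2c^2d^2}\bigl[a^2(db+bc+cd) - d^2(ab+bc+ca)\bigr].
\end{aligned}$$

The numerator can be rewritten as

$$\begin{aligned}
&a^2bc + a^2cd + a^2db - d^2ab - d^2bc - d^2ca\\
&= bc\,(a^2-d^2) + acd\,(a-d) + abd\,(a-d)\\
&= (a-d)\,(abc+abd+acd+bcd),
\end{aligned}$$

in view of which,

$$\triangle\!\!\!|^{\,3}_{\,bcd}\,f(a) = -\frac{abc+abd+acd+bcd}{a^2b^2c^2d^2}.$$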

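Ex. 3. Find the divided differences of the various orders for
the function $y = x^2$ when x = a, b, c, ...

By definition,

$$\triangle\!\!\!|_{\,b}\,f(a) = \frac{f(b)-f(a)}{b-a},$$

which may be rewritten as

$$\triangle\!\!\!|_{\,b}\,f(a) = \frac{f(a)}{a-b} + \frac{f(b)}{b-a}.$$

Then
$$\triangle\!\!\!|_{\,b}\,f(a) = \frac{a^2}{a-b} + \frac{b^2}{b-a} = \frac{a^2-b^2}{a-b} = a+b.$$

By definition,
$$\triangle\!\!\!|^{\,2}_{\,bc}\,f(a) = \triangle\!\!\!|_{\,c}\bigl[\triangle\!\!\!|_{\,b}\,f(a)\bigr] = \triangle\!\!\!|_{\,c}\,(a+b)
 = \frac{a+b}{a-c} + \frac{c+b}{c-a} = \frac{a+b-c-b}{a-c} = 1.$$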

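Ex. 4. Establish : (i) $\triangle\!\!\!|^{\,2}_{\,yz}\,x^3 = x+y+z$, (ii) $\triangle\!\!\!|^{\,2}_{\,bc}\,(1/a) = 1/abc$,
(iii) $\triangle\!\!\!|^{\,3}_{\,bcd}\,(1/a) = -1/abcd$.

For,
$$\begin{aligned}
\triangle\!\!\!|^{\,2}_{\,yz}\,x^3 &= \triangle\!\!\!|_{\,z}\bigl(\triangle\!\!\!|_{\,y}\,x^3\bigr)
 = \triangle\!\!\!|_{\,z}\Bigl(\frac{x^3}{x-y}+\frac{y^3}{y-x}\Bigr)
 = \triangle\!\!\!|_{\,z}\Bigl(\frac{x^3-y^3}{x-y}\Bigr) = \triangle\!\!\!|_{\,z}\,(x^2+xy+y^2)\\
&= \frac{x^2+xy+y^2}{x-z} + \frac{z^2+zy+y^2}{z-x} = \frac{(x^2-z^2)+y\,(x-z)}{x-z}\\
&= x+y+z, \quad\text{proving (i).}
\end{aligned}$$

Now
$$\begin{aligned}
\triangle\!\!\!|^{\,2}_{\,bc}\,(1/a) &= \triangle\!\!\!|_{\,c}\bigl[\triangle\!\!\!|_{\,b}\,(1/a)\bigr]
 = \triangle\!\!\!|_{\,c}\Bigl[\frac{1}{a-b}\Bigl(\frac1a-\frac1b\Bigr)\Bigr]
 = \triangle\!\!\!|_{\,c}\Bigl(-\frac{1}{ab}\Bigr)\\
&= \frac{-1/ab}{a-c} + \frac{-1/cb}{c-a}
 = \frac{1}{b\,(a-c)}\Bigl(\frac1c-\frac1a\Bigr)
 = \frac{a-c}{abc\,(a-c)} = \frac{1}{abc},
\end{aligned}$$
proving (ii).

(iii) Now
$$\begin{aligned}
\triangle\!\!\!|^{\,3}_{\,bcd}\,(1/a) &= \triangle\!\!\!|_{\,d}\bigl[\triangle\!\!\!|^{\,2}_{\,bc}\,(1/a)\bigr]
 = \frac{1/abc}{a-d} + \frac{1/dbc}{d-a}\\
&= \frac{1}{a-d}\Bigl(\frac{1}{abc}-\frac{1}{bcd}\Bigr)
 = \frac{1}{a-d}\cdot\frac{d-a}{abcd} = -\frac{1}{abcd},
\end{aligned}$$
completing the proof of (iii).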

Properties of Divided Differences.

Theorem 17. Divided differences are symmetric functions of
their arguments, i.e. divided differences are independent of the
order of the arguments.

Proof. Let a, b, c be the three suffixes of the operator and
operand of a second order divided difference. Then we show that

$$\triangle\!\!\!|^{\,2}_{\,bc}\,f(a) = \triangle\!\!\!|^{\,2}_{\,ca}\,f(b) = \triangle\!\!\!|^{\,2}_{\,ab}\,f(c),$$

and that this result holds good for any number of suffixes.

By definition,
$$\triangle\!\!\!|_{\,b}\,f(a) = \frac{f(b)-f(a)}{b-a} = \frac{f(a)-f(b)}{a-b} = \triangle\!\!\!|_{\,a}\,f(b).$$

Similarly,

$$\triangle\!\!\!|^{\,2}_{\,cb}\,f(a) = \triangle\!\!\!|_{\,b}\bigl[\triangle\!\!\!|_{\,c}\,f(a)\bigr] = \triangle\!\!\!|_{\,b}\bigl[\triangle\!\!\!|_{\,a}\,f(c)\bigr] = \triangle\!\!\!|^{\,2}_{\,ab}\,f(c).$$

Mathematical induction proves the theorem.

Theorem 18. The nth divided difference of a polynomial of degree
n is constant. [Lucknow B. Sc. 67]

Proof. It suffices to consider the function $f(x) = x^n$. Let the
(n+1) values of the function f(x) corresponding to the arguments
be

x :     a, b, c, ..., k, l

f(x) :  $a^n, b^n, c^n, \dots, k^n, l^n$

Then
$$\triangle\!\!\!|_{\,b}\,a^n = \frac{b^n-a^n}{b-a} = b^{n-1} + b^{n-2}a + \dots + a^{n-1},$$

which is a symmetric function of degree (n-1) in a and b.

Note. $\triangle\!\!\!|_{\,b}\,a^n$ = the coefficient of $x^{n-1}$ in the expansion of

$$(1+bx+b^2x^2+\dots)(1+ax+a^2x^2+\dots)$$

= the coefficient of $x^{n-1}$ in the expansion of $\dfrac{1}{1-bx}\cdot\dfrac{1}{1-ax}$

= the coefficient of $x^{n-1}$ in $[(1-ax)(1-bx)]^{-1}$.

Similarly, $\triangle\!\!\!|_{\,c}\,b^n$ = the coefficient of $x^{n-1}$ in the expansion of

$$[(1-bx)(1-cx)]^{-1},$$

and so on for $\triangle\!\!\!|_{\,d}\,c^n$, ..., $\triangle\!\!\!|_{\,l}\,k^n$.

Now by definition,
$$\begin{aligned}
\triangle\!\!\!|^{\,2}_{\,bc}\,a^n &= \frac{1}{c-a}\bigl[\triangle\!\!\!|_{\,c}\,b^n - \triangle\!\!\!|_{\,b}\,a^n\bigr]\\
&= \frac{1}{c-a}\Bigl[\text{coeff. of } x^{n-1} \text{ in } \frac{1}{(1-cx)(1-bx)} - \text{coeff. of } x^{n-1} \text{ in } \frac{1}{(1-bx)(1-ax)}\Bigr]\\
&= \frac{1}{c-a}\Bigl[\text{coeff. of } x^{n-1} \text{ in } \Bigl\{\frac{1}{(1-cx)(1-bx)} - \frac{1}{(1-bx)(1-ax)}\Bigr\}\Bigr]\\
&= \text{the coeff. of } x^{n-1} \text{ in the expansion of } \frac{x}{(1-cx)(1-bx)(1-ax)}\\
&= \text{the coeff. of } x^{n-2} \text{ in the expansion of } [(1-cx)(1-bx)(1-ax)]^{-1}.
\end{aligned}$$

Similarly,

$\triangle\!\!\!|^{\,2}_{\,cd}\,b^n$ = the coeff. of $x^{n-2}$ in the expansion of

$$[(1-dx)(1-cx)(1-bx)]^{-1},$$

and so on we can find $\triangle\!\!\!|^{\,2}_{\,de}\,c^n$, ...

Mathematical induction leads to the generalisation that

$\triangle\!\!\!|^{\,m}\,a^n$ will be the coefficient of $x^{n-m}$ in the expansion of

$$[(1-ax)(1-bx)(1-cx)\dots]^{-1} \quad\text{(a product of m+1 factors)},$$

and hence $\triangle\!\!\!|^{\,n}_{\,bc\dots kl}\,a^n$ is the coefficient of $x^0$ in the expansion of
$[(1-ax)(1-bx)\dots(1-kx)(1-lx)]^{-1}$, which is unity,

i.e. $\triangle\!\!\!|^{\,n}\,x^n = 1.$

In general, $\triangle\!\!\!|^{\,n}\,(cx^n) = c\,\triangle\!\!\!|^{\,n}\,x^n = c.$

This proves that the nth divided difference of a polynomial
of degree n is constant.

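18.12 Relation between an ordinary difference and a divided difference.

Let the values of the function y = f(x), corresponding to
x = a, b, c, d and e,
be f(a), f(b), f(c), f(d) and f(e).

Now if a, b, c, ... are equally spaced, then the ordinary differences
are

$\Delta f(a) = f(b)-f(a)$

$\Delta^2 f(a) = \Delta f(b)-\Delta f(a)$

$\Delta^3 f(a) = \Delta^2 f(b)-\Delta^2 f(a)$

$\Delta^4 f(a) = \Delta^3 f(b)-\Delta^3 f(a)$

and the divided differences are

$$\triangle\!\!\!|_{\,b}\,f(a) = \frac{f(b)-f(a)}{b-a} = \frac{\Delta f(a)}{h}, \quad\text{if } b-a = c-b = d-c = e-d = h, \text{ say},$$

$$\triangle\!\!\!|^{\,2}_{\,bc}\,f(a) = \frac{\triangle\!\!\!|_{\,c}\,f(b)-\triangle\!\!\!|_{\,b}\,f(a)}{c-a}
 = \frac{(1/h)\,\Delta f(b)-(1/h)\,\Delta f(a)}{2h}
 = \frac{\Delta f(b)-\Delta f(a)}{2h^2} = \frac{\Delta^2 f(a)}{2!\,h^2},$$

$$\triangle\!\!\!|^{\,3}_{\,bcd}\,f(a) = \frac{\triangle\!\!\!|^{\,2}_{\,cd}\,f(b)-\triangle\!\!\!|^{\,2}_{\,bc}\,f(a)}{d-a}
 = \frac{\Delta^2 f(b)/2h^2-\Delta^2 f(a)/2h^2}{3h}
 = \frac{\Delta^2 f(b)-\Delta^2 f(a)}{h^3\cdot1\cdot2\cdot3} = \frac{\Delta^3 f(a)}{3!\,h^3}.$$

Similarly,
$$\triangle\!\!\!|^{\,4}_{\,bcde}\,f(a) = \frac{\Delta^4 f(a)}{4!\,h^4}.$$

In general, we find

$$\triangle\!\!\!|^{\,j}_{\,bcde\dots}\,f(a) = \frac{1}{j!\,h^j}\,\Delta^j f(a), \qquad j = 1, 2, 3, \dots,$$

which implies

$$\Delta^j f(a) = j!\,h^j\,\triangle\!\!\!|^{\,j}\,f(a), \qquad j = 1, 2, 3, \dots$$

This is the required relation between $\Delta$ and $\triangle\!\!\!|$.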
18.13. Newton's divided difference formula for interpolation.

(Delhi B.Sc. (S) 67, Agra B.Sc. 62)

Let the entries corresponding to the function $y = f(x)$ at
$$x = a,\ b,\ c,\ d,\ \ldots,\ k,\ l$$
be $f(a), f(b), f(c), f(d), \ldots, f(k), f(l)$.

Then we can fit a polynomial of nth degree in view of $n+1$ entries.

By the definition,
$$⍋_{x} f(a) = \frac{f(x) - f(a)}{x - a}.$$
Then
$$f(x) = f(a) + (x-a)\,⍋_{x} f(a)$$ ...(1)

Now
$$⍋^{2}_{bx} f(a) = \frac{⍋_{x} f(a) - ⍋_{b} f(a)}{x - b},$$
i.e.
$$⍋_{x} f(a) = ⍋_{b} f(a) + (x-b)\,⍋^{2}_{bx} f(a)$$ ...(2)

On insertion of (2) in (1), we find
$$f(x) = f(a) + (x-a)\{⍋_{b} f(a) + (x-b)\,⍋^{2}_{bx} f(a)\}$$
$$= f(a) + (x-a)\,⍋_{b} f(a) + (x-a)(x-b)\,⍋^{2}_{bx} f(a)$$ ...(3)

Now
$$⍋^{3}_{bcx} f(a) = \frac{⍋^{2}_{bx} f(a) - ⍋^{2}_{bc} f(a)}{x - c}.$$
Then
$$⍋^{2}_{bx} f(a) = ⍋^{2}_{bc} f(a) + (x-c)\,⍋^{3}_{bcx} f(a).$$

Putting this value in (3), we obtain
$$f(x) = f(a) + (x-a)\,⍋_{b} f(a) + (x-a)(x-b)\{⍋^{2}_{bc} f(a) + (x-c)\,⍋^{3}_{bcx} f(a)\}$$
$$= f(a) + (x-a)\,⍋_{b} f(a) + (x-a)(x-b)\,⍋^{2}_{bc} f(a) + (x-a)(x-b)(x-c)\,⍋^{3}_{bcx} f(a)$$ ...(4)

Similarly, substituting the values of $⍋^{3}_{bcx} f(a)$, $⍋^{4}_{bcdx} f(a)$, ..., $⍋^{n}_{bc\ldots kx} f(a)$ gives:
$$f(x) = f(a) + (x-a)\,⍋_{b} f(a) + (x-a)(x-b)\,⍋^{2}_{bc} f(a) + \cdots$$
$$+ (x-a)(x-b)\cdots(x-k)\,⍋^{n}_{bc\ldots kl} f(a)$$
$$+ (x-a)(x-b)(x-c)\cdots(x-l)\,⍋^{n+1}_{bc\ldots lx} f(a)$$ ...(5)

Define $P(x)$ and $R(x)$:
$$P(x) = f(a) + (x-a)\,⍋_{b} f(a) + \cdots + (x-a)\cdots(x-k)\,⍋^{n}_{bc\ldots kl} f(a)$$
and
$$R(x) = (x-a)\cdots(x-l)\,⍋^{n+1}_{bc\ldots lx} f(a).$$

Then (5) gives: $f(x) = P(x) + R(x)$ ...(6)

In fact $P(x)$ is a polynomial of degree n in x, since the term containing $⍋^{n}_{bc\ldots kl} f(a)$ has the n factors
$$(x-a),\ (x-b),\ \ldots,\ (x-k)$$
in its coefficient. In the expression of $R(x)$ there are two parts: (i) $(x-a)\cdots(x-l)$, which is a product of $(n+1)$ factors and so is a polynomial of degree $(n+1)$; (ii) $⍋^{n+1}_{bc\ldots lx} f(a)$, which contains the variable x in its suffixes.

The term $R(x)$ vanishes for $x = a, x = b, \ldots, x = l$, so $P(x)$ represents the given set of entries. But at most a polynomial of nth degree can be fitted to the given $n+1$ entries. Hence $⍋^{n+1}_{bc\ldots lx} f(a)$ vanishes, i.e. $R(x)$ should be zero. Then
$$f(x) = P(x) = f(a) + (x-a)\,⍋_{b} f(a) + (x-a)(x-b)\,⍋^{2}_{bc} f(a) + \cdots + (x-a)\cdots(x-k)\,⍋^{n}_{bc\ldots kl} f(a)$$ ...(7)
which bears the name of Newton's divided difference formula for interpolation.
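Formula (7) can be sketched in code as follows. This is an illustrative implementation, not the authors' own; the function names are assumptions. It builds the leading divided differences in place and evaluates the Newton form by nested multiplication, using the table of the first illustrative exercise of this section as data.

```python
def divided_difference_coefficients(xs, ys):
    """Leading divided differences: f(a), first, second, ... (top edge of the table)."""
    coeffs = list(ys)
    n = len(xs)
    for order in range(1, n):
        # work from the bottom up so lower-order values are still available
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs

def newton_evaluate(xs, coeffs, x):
    """Evaluate f(a) + (x-a)c1 + (x-a)(x-b)c2 + ... by nested multiplication."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

xs = [4, 5, 7, 10, 11, 13]
ys = [48, 100, 294, 900, 1210, 2028]
coeffs = divided_difference_coefficients(xs, ys)
print(coeffs)
print(newton_evaluate(xs, coeffs, 8), newton_evaluate(xs, coeffs, 15))
```

The nested form avoids recomputing the products $(x-a)(x-b)\cdots$ for every term.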

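18.14. Newton-Gregory formula for equal intervals as a particular case of Newton's divided difference formula.

Let the values of the argument be given at equal intervals of width h. Then $b = a+h$, $c = a+2h$, .... Let $x = a + nh$. Then putting these values, and the values of $⍋_{b} f(a)$, $⍋^{2}_{bc} f(a)$, ..., $⍋^{n}_{bc\ldots kl} f(a)$ in terms of ordinary differences, in (7), we get

$$f(x) = f(a) + nh\,\frac{\Delta f(a)}{h} + nh(nh + a - b)\,\frac{\Delta^{2} f(a)}{2!\,h^{2}} + \cdots + nh(nh + a - b)\cdots(nh + a - k)\,\frac{\Delta^{n} f(a)}{n!\,h^{n}}$$

$$= f(a) + n\,\Delta f(a) + \frac{n(n-1)}{2!}\,\Delta^{2} f(a) + \cdots + \frac{n(n-1)\cdots 2\cdot 1}{n!}\,\Delta^{n} f(a)$$

$$= f(a) + \binom{n}{1}\Delta f(a) + \binom{n}{2}\Delta^{2} f(a) + \cdots + \Delta^{n} f(a),$$
which is the Newton-Gregory formula.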

Theorem 19. The nth divided difference of a function is expressible in the form of the quotient of two determinants, each of order (n+1).

Illustrative Exercises.

Ex. 1. Apply Newton's divided difference formula to compute f(8) and f(15) from the following table:

x    :   4     5     7     10    11     13
f(x) :   48    100   294   900   1210   2028

The divided difference table is as follows:

x      f(x)     ⍋f(x)    ⍋²f(x)   ⍋³f(x)   ⍋⁴f(x)
4      48
                52
5      100               15
                97                 1
7      294               21                 0
                202                1
10     900               27                 0
                310                1
11     1210              33
                409
13     2028

Here, for example,
$$\frac{100-48}{5-4} = 52, \quad \frac{97-52}{7-4} = 15, \quad \frac{21-15}{10-4} = 1,$$
and the remaining entries are formed in the same way.

Newton's divided difference formula is given by
$$f(x) = f(x_0) + (x-x_0)\,⍋_{x_1} f(x_0) + (x-x_0)(x-x_1)\,⍋^{2}_{x_1 x_2} f(x_0)$$
$$+ (x-x_0)(x-x_1)(x-x_2)\,⍋^{3}_{x_1 x_2 x_3} f(x_0) + (x-x_0)(x-x_1)(x-x_2)(x-x_3)\,⍋^{4}_{x_1 x_2 x_3 x_4} f(x_0) + \cdots$$

Then
$$f(8) = 48 + (8-4)\cdot 52 + (8-4)(8-5)\cdot 15 + (8-4)(8-5)(8-7)\cdot 1 = 448$$
and
$$f(15) = 48 + (15-4)\cdot 52 + (15-4)(15-5)\cdot 15 + (15-4)(15-5)(15-7)\cdot 1 = 3150.$$

Ex. 4. Find f(x) as a polynomial in powers of (x-5) from the following data:

x    :   0    2    3    4     7     9
f(x) :   4    26   58   112   466   922

[M.A. Delhi 61]

The difference table is constructed as follows:

(1)    (2)      (3)      (4)      (5)
x      f(x)     ⍋f(x)    ⍋²f(x)   ⍋³f(x)
0      4
                11
2      26                7
                32                1
3      58                11
                54                1
4      112               16
                118               1
7      466               22
                228
9      922
. . . . . . . . . . . . . . . . . . . . .
5      y
5      y
5      y

Here, for example,
$$\frac{26-4}{2-0} = 11, \quad \frac{32-11}{3-0} = 7, \quad \frac{11-7}{4-0} = 1.$$

The above table indicates that the third differences are all equal to 1. This implies that the given data can be represented by a polynomial of the third degree. In order to represent the given polynomial in powers of (x-5), we write 5 three times below the dotted line, with the unknown entry $y = f(5)$, and denote the new differences as follows:

first differences (column 3): $d_1 = \dfrac{y - 922}{5 - 9}$ for the arguments 9, 5, and $d_2$ for the arguments 5, 5;

second differences (column 4): $g_1 = \dfrac{d_1 - 228}{5 - 7}$ for the arguments 7, 9, 5; $g_2 = \dfrac{d_2 - d_1}{5 - 9}$ for the arguments 9, 5, 5; and $g_3$ for the arguments 5, 5, 5;

third differences (column 5): all equal to 1.

We then compute $y$, $d_2$ and $g_3$. The table provides the following equations:

$$\frac{g_1 - 22}{5 - 4} = 1 \Rightarrow g_1 = 23 \quad \text{(from column 5)}$$
$$\frac{g_2 - g_1}{5 - 7} = 1 \Rightarrow g_2 = 21 \quad \text{(from column 5)}$$
$$\frac{g_3 - g_2}{5 - 9} = 1 \Rightarrow g_3 = 17 \quad \text{(from column 5)}$$
$$\frac{d_1 - 228}{5 - 7} = g_1 = 23 \Rightarrow d_1 = 182 \quad \text{(from column 4)}$$
$$\frac{d_2 - d_1}{5 - 9} = g_2 = 21 \text{ with } d_1 = 182 \Rightarrow d_2 = 98 \quad \text{(from column 4)}$$
$$\frac{y - 922}{5 - 9} = d_1 = 182 \Rightarrow y = 194 \quad \text{(from column 3)}$$

Newton's divided difference formula gives:
$$f(x) = f(x_0) + (x-x_0)\,⍋_{x_1} f(x_0) + (x-x_0)(x-x_1)\,⍋^{2}_{x_1 x_2} f(x_0) + (x-x_0)(x-x_1)(x-x_2)\,⍋^{3}_{x_1 x_2 x_3} f(x_0),$$
in view of which, on letting $x_0 = x_1 = x_2 = x_3 = 5$, we find
$$f(x) = y + (x-5)\,d_2 + (x-5)^{2}\,g_3 + (x-5)^{3}\cdot 1$$
$$= 194 + (x-5)\cdot 98 + (x-5)^{2}\cdot 17 + (x-5)^{3},$$
i.e.
$$f(x) = 194 + 98\,(x-5) + 17\,(x-5)^{2} + (x-5)^{3},$$
which is the desired polynomial in powers of (x-5).
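The same re-centring can be done mechanically. The sketch below is an assumption-labelled alternative to the tabular method above, not the book's procedure: it starts from the leading divided differences 4, 11, 7, 1 (for the arguments 0, 2, 3) and repeatedly applies nested multiplication to move every centre to 5. The function name `recenter_newton` is hypothetical.

```python
def recenter_newton(coeffs, centers, z):
    """Rewrite a Newton-form polynomial in powers of (x - z).

    coeffs[k] multiplies (x - centers[0]) ... (x - centers[k-1]),
    so len(centers) == len(coeffs) - 1.
    """
    coeffs = list(coeffs)
    centers = list(centers)
    n = len(coeffs) - 1
    for _ in range(n):
        # one nested-multiplication pass makes z the first centre
        for i in range(n - 1, -1, -1):
            coeffs[i] += (z - centers[i]) * coeffs[i + 1]
        centers = [z] + centers[:-1]
    return coeffs

# leading divided differences of the table above, with centres 0, 2, 3
print(recenter_newton([4, 11, 7, 1], [0, 2, 3], 5))
```

After n passes all centres equal 5, so the returned list holds the coefficients of 1, (x-5), (x-5)^2 and (x-5)^3.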

18.15. Lagrange's Interpolation Formula

Theorem 20. Assume that $u_{a_j}$ (j = 0, 1, 2, ..., n) are the n+1 entries corresponding to the arguments $a_j$ (j = 0, 1, 2, ..., n), which are not necessarily equally spaced. Then the function $u_x$ can be approximated by a polynomial of the nth degree, i.e.

$$u_x = \frac{(x-a_1)(x-a_2)\cdots(x-a_n)}{(a_0-a_1)(a_0-a_2)\cdots(a_0-a_n)}\,u_{a_0} + \frac{(x-a_0)(x-a_2)\cdots(x-a_n)}{(a_1-a_0)(a_1-a_2)\cdots(a_1-a_n)}\,u_{a_1} + \cdots + \frac{(x-a_0)(x-a_1)\cdots(x-a_{n-1})}{(a_n-a_0)(a_n-a_1)\cdots(a_n-a_{n-1})}\,u_{a_n}$$ ...(1)

(1) is known as Lagrange's interpolation formula.

Proof. As a matter of fact, the order of a divided difference is one less than the number of arguments. Hence $⍋^{n+1}_{a_0 a_1 \ldots a_n} u_x$ denotes the divided difference of order (n+1) corresponding to the (n+2) arguments $x, a_0, a_1, \ldots, a_n$. Our hypothesis says that $u_x$ is a polynomial of degree n. Then $j > n \Rightarrow ⍋^{j} u_x = 0$.

Consequently
$$⍋^{n+1}_{a_0 a_1 \ldots a_n} u_x = 0,$$
which, on writing the divided difference as a symmetric function of its arguments, gives:

$$\frac{u_x}{(x-a_0)(x-a_1)\cdots(x-a_n)} + \frac{u_{a_0}}{(a_0-x)(a_0-a_1)(a_0-a_2)\cdots(a_0-a_n)} + \frac{u_{a_1}}{(a_1-x)(a_1-a_0)(a_1-a_2)\cdots(a_1-a_n)}$$
$$+ \frac{u_{a_2}}{(a_2-x)(a_2-a_0)(a_2-a_1)\cdots(a_2-a_n)} + \cdots + \frac{u_{a_n}}{(a_n-x)(a_n-a_0)\cdots(a_n-a_{n-1})} = 0.$$

Then
$$\frac{u_x}{(x-a_0)(x-a_1)(x-a_2)\cdots(x-a_n)} = \frac{u_{a_0}}{(x-a_0)(a_0-a_1)\cdots(a_0-a_n)} + \frac{u_{a_1}}{(x-a_1)(a_1-a_0)(a_1-a_2)\cdots(a_1-a_n)}$$
$$+ \frac{u_{a_2}}{(x-a_2)(a_2-a_0)(a_2-a_1)\cdots(a_2-a_n)} + \cdots + \frac{u_{a_n}}{(x-a_n)(a_n-a_0)\cdots(a_n-a_{n-1})}$$ ...(2)

which assumes the form
$$u_x = \frac{(x-a_1)(x-a_2)\cdots(x-a_n)}{(a_0-a_1)(a_0-a_2)\cdots(a_0-a_n)}\,u_{a_0} + \frac{(x-a_0)(x-a_2)\cdots(x-a_n)}{(a_1-a_0)(a_1-a_2)\cdots(a_1-a_n)}\,u_{a_1} + \cdots + \frac{(x-a_0)(x-a_1)\cdots(x-a_{n-1})}{(a_n-a_0)(a_n-a_1)\cdots(a_n-a_{n-1})}\,u_{a_n}.$$

This proves the theorem.

Remarks. 1. (2) is also Lagrange's formula of interpolation.

2. Formula (2) is equivalent to splitting the function
$$\frac{u_x}{(x-a_0)(x-a_1)\cdots(x-a_n)}$$
into partial fractions.

3. Either of the two formulae of interpolation, viz. Newton's divided difference formula or Lagrange's formula, can be used in case the arguments are not equally spaced. These formulae are also helpful in finding the form of the function $u_x$.
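Lagrange's formula translates directly into a double loop over the arguments. The following is a minimal sketch, with an assumed function name, applied to the four logarithm entries of the illustrative example that follows (log 654, 658, 659, 661).

```python
def lagrange_interpolate(args, entries, x):
    """Value at x of the Lagrange polynomial through (args[i], entries[i])."""
    total = 0.0
    for i, (a_i, u_i) in enumerate(zip(args, entries)):
        weight = 1.0
        for j, a_j in enumerate(args):
            if j != i:
                weight *= (x - a_j) / (a_i - a_j)   # factor of the ith coefficient
        total += weight * u_i
    return total

args = [654, 658, 659, 661]
entries = [2.8156, 2.8182, 2.8189, 2.8202]
print(round(lagrange_interpolate(args, entries, 656), 4))
```

The arguments need not be equally spaced, and their order does not affect the result.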

Illustrative Examples

Ex. 1. Given that $\log_{10} 654 = 2.8156$, $\log_{10} 658 = 2.8182$, $\log_{10} 659 = 2.8189$, $\log_{10} 661 = 2.8202$, find $\log_{10} 656$ using two different interpolation formulae available for observations at unequal intervals, say Lagrange's formula and the formula for divided differences.

[I.A.S. 1956]

The difference table is given below:

x            log x = y        ⍋          ⍋²           ⍋³
654 (x_0)    2.8156 (y_0)
                              0.00065
658 (x_1)    2.8182 (y_1)                0.00001
                              0.00070                 -0.000004
659 (x_2)    2.8189 (y_2)                -0.000017
                              0.00065
661 (x_3)    2.8202 (y_3)

Newton's divided difference formula is
$$y_x = y_0 + (x-x_0)\,⍋ y_0 + (x-x_0)(x-x_1)\,⍋^{2} y_0 + (x-x_0)(x-x_1)(x-x_2)\,⍋^{3} y_0 + \cdots$$

Substituting the values of the ⍋'s and x's, we find
$$y_{656} = 2.8156 + (656-654)(0.00065) + (656-654)(656-658)(0.00001) + (656-654)(656-658)(656-659)(-0.000004)$$
$$= 2.8156 + 0.0013 - 0.00004 - 0.000048$$
$$= 2.816812.$$

Then the interpolated value of $\log_{10} 656$ by Newton's divided difference formula = 2.8168 approx.

Lagrange's formula
$$y_x = y_0\,\frac{(x-x_1)(x-x_2)\cdots(x-x_n)}{(x_0-x_1)(x_0-x_2)\cdots(x_0-x_n)} + y_1\,\frac{(x-x_0)(x-x_2)\cdots(x-x_n)}{(x_1-x_0)(x_1-x_2)\cdots(x_1-x_n)} + \cdots + y_n\,\frac{(x-x_0)(x-x_1)\cdots(x-x_{n-1})}{(x_n-x_0)(x_n-x_1)\cdots(x_n-x_{n-1})}$$
gives
$$y_{656} = 2.8156\,\frac{(656-658)(656-659)(656-661)}{(654-658)(654-659)(654-661)} + 2.8182\,\frac{(656-654)(656-659)(656-661)}{(658-654)(658-659)(658-661)}$$
$$+ 2.8189\,\frac{(656-654)(656-658)(656-661)}{(659-654)(659-658)(659-661)} + 2.8202\,\frac{(656-654)(656-658)(656-659)}{(661-654)(661-658)(661-659)}$$
$$= 2.8156\,\frac{(-2)(-3)(-5)}{(-4)(-5)(-7)} + 2.8182\,\frac{(2)(-3)(-5)}{(4)(-1)(-3)} + 2.8189\,\frac{(2)(-2)(-5)}{(5)(1)(-2)} + 2.8202\,\frac{(2)(-2)(-3)}{(7)(3)(2)}$$
$$= 0.60334 + 7.0455 - 5.6378 + 0.80577$$
$$= 2.81681.$$

Then the interpolated value of $\log_{10} 656$ by Lagrange's formula = 2.8168.

Ex. 2. Let $y_0, y_1, y_2, \ldots, y_6$ be the consecutive terms of a series. Then
$$y_3 = 0.05\,(y_0 + y_6) - 0.3\,(y_1 + y_5) + 0.75\,(y_2 + y_4).$$

[B.Sc. Agra 1956]

Lagrange's formula gives:
$$y_x = y_0\,\frac{(x-x_1)(x-x_2)\cdots(x-x_n)}{(x_0-x_1)(x_0-x_2)\cdots(x_0-x_n)} + y_1\,\frac{(x-x_0)(x-x_2)\cdots(x-x_n)}{(x_1-x_0)(x_1-x_2)\cdots(x_1-x_n)} + \cdots + y_n\,\frac{(x-x_0)(x-x_1)\cdots(x-x_{n-1})}{(x_n-x_0)(x_n-x_1)\cdots(x_n-x_{n-1})}$$

Letting $x = 3$, $x_0 = 0$, $x_1 = 1$, $x_2 = 2$, $x_4 = 4$, $x_5 = 5$, $x_6 = 6$ leads to
$$y_3 = y_0\,\frac{(3-1)(3-2)(3-4)(3-5)(3-6)}{(0-1)(0-2)(0-4)(0-5)(0-6)} + y_1\,\frac{(3-0)(3-2)(3-4)(3-5)(3-6)}{(1-0)(1-2)(1-4)(1-5)(1-6)}$$
$$+ y_2\,\frac{(3-0)(3-1)(3-4)(3-5)(3-6)}{(2-0)(2-1)(2-4)(2-5)(2-6)} + y_4\,\frac{(3-0)(3-1)(3-2)(3-5)(3-6)}{(4-0)(4-1)(4-2)(4-5)(4-6)}$$
$$+ y_5\,\frac{(3-0)(3-1)(3-2)(3-4)(3-6)}{(5-0)(5-1)(5-2)(5-4)(5-6)} + y_6\,\frac{(3-0)(3-1)(3-2)(3-4)(3-5)}{(6-0)(6-1)(6-2)(6-4)(6-5)}$$
$$= 0.05\,y_0 - 0.3\,y_1 + 0.75\,y_2 + 0.75\,y_4 - 0.3\,y_5 + 0.05\,y_6$$
$$= 0.05\,(y_0 + y_6) - 0.3\,(y_1 + y_5) + 0.75\,(y_2 + y_4).$$

This is what we wished to prove.
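The six coefficients can be confirmed with exact fractions. This is an illustrative check, not part of the original solution; the helper name `lagrange_weights` is an assumption.

```python
from fractions import Fraction

def lagrange_weights(args, x):
    """Exact Lagrange coefficients of the entries at args, evaluated at x."""
    weights = []
    for i, a_i in enumerate(args):
        w = Fraction(1)
        for j, a_j in enumerate(args):
            if j != i:
                w *= Fraction(x - a_j, a_i - a_j)
        weights.append(w)
    return weights

# arguments 0, 1, 2, 4, 5, 6 with the missing term at x = 3
weights = lagrange_weights([0, 1, 2, 4, 5, 6], 3)
print([float(w) for w in weights])
```

The weights are symmetric about x = 3 and sum to 1, as they must for any Lagrange formula.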

Exercises

1. Define divided differences. Show that the divided differences of order n of a polynomial of nth degree are constant. [Lucknow 67]

2. A divided difference is a symmetric function of its arguments.

3. Assume that all the intervals are equal to h. Then
$$⍋^{n}_{x_1 x_2 \ldots x_n} u_{x_0} = \frac{\Delta^{n} u_{x_0}}{h^{n}\,n!}.$$

4. $⍋_{b} f(a) = a + b$ and $⍋^{2}_{bc} f(a) = 1$ for the function $f(x) = x^{2}$.

5. Establish that

(i) $⍋^{2}_{yz}\, x^{3} = x + y + z$, (ii) $⍋^{2}_{bc}\,(1/a) = 1/abc$,

(iii) $⍋^{3}_{bcd}\,(1/a) = -\dfrac{1}{abcd}$

[Bangalore 60; Lucknow 65; Delhi 60]









x x 


(- 1 )” 


[Beneras 67) 


6. Derive a symmetric expansion for $⍋^{2}_{bc} u_a$ when
$$⍋_{b} u_a = \frac{u_a}{a-b} + \frac{u_b}{b-a}.$$ [Bombay 68]

7. State and prove Newton's divided difference interpolation formula and then deduce the Newton-Gregory interpolation formula. [Delhi 67; Bangalore 68; Agra 62]

8. Derive Newton's divided difference formula, viz.
$$u_x = u_a + (x-a)\,⍋_{b} u_a + (x-a)(x-b)\,⍋^{2}_{bc} u_a + \cdots + (x-a)(x-b)(x-c)\cdots(x-k)\,⍋^{n}_{bc\ldots l} u_a.$$ [Bombay 60]

9. (a) Find the third divided difference with arguments 2, 4, 9, 10 of the function $f(x) = x^{3} - 2x$.

(b) Find a polynomial satisfied by (-4, 1245), (-1, 33), (0, 5), (2, 9) and (5, 1335).

(B.A. Hons. Delhi 1967)

[Ans. (a) 1; (b) $f(x) = 3x^{4} - 5x^{3} + 6x^{2} - 14x + 5$,
where $⍋_{-1} f(-4) = -404$, $⍋^{2}_{-1,0} f(-4) = 94$, $⍋^{3}_{-1,0,2} f(-4) = -14$, $⍋^{4}_{-1,0,2,5} f(-4) = 3$]

10. Construct the divided difference table from the following data, and then find f(x) as a polynomial in powers of (x-6). Find also $f'(6)$, $f''(6)$ and $f'''(6)$.

x    :   -1    0    2    3    7     10
f(x) :   -11   1    1    1    141   561

[Ans. $f(x) = 73 + 54\,(x-6) + 13\,(x-6)^{2} + (x-6)^{3}$;
$f'(6) = 54$, $f''(6) = 26$, $f'''(6) = 6$]

11. The nth divided difference of $x^{n}$ is 1.

12. Construct the divided difference table from the data:
$f(0) = 8$, $f(1) = 68$ and $f(5) = 123$, and then compute $f(2)$.

(Punjab 1957)

[Ans. A,/(0)=60, *,*»/(>—9 25, A 8 /(0)=3; 

J( 2)= 09*50] 

13. The polynomial of the lowest possible degree assuming the values 3, 12, 15, -21 corresponding to the arguments 3, 2, 1 and -1 respectively is
$$x^{3} - 9x^{2} + 17x + 6.$$ (Bombay 58)

14. The following table gives the values of x (independent variable) and y (dependent variable). Compute f(4) by using Newton's divided difference method.

x: 0 i 3 5 7 

y- 4 5 J3 40 53 

[A ns /(4)=20J 

15. Apply Newton's divided difference formula to obtain the value of f(8) from the following set of values:

x    :   4    5    7    10    11    13
f(x) :   2    4    8    104   114   452

(I.A.S. 1967)

16 (a) Given that w 0 =—4, ^=—2, w,=220, 1 ^ = 546, //«=J|48, 
compute t / 2 and t/ 3 . (Madras B.Sc. 60) 

(b) Given that u 3 = 168, u 7 = 120, u 9 =72, and w Jo =63, find 

and w 8 . (Madras B Sc 1967) 

( c ) Given that w 1 . B =21*3, w 4 . a =37*8, w fl =42*7, i / 7 . 5 = 50 l 

and w 8 = 51*6 compute u b . (Madras B Sc 1965) 

17. Find the polynomial /(x) and then/ (3) from the follow¬ 
ing data : 

* : 1 2 4 5 7 9 

f(x) : -96 3 120 0 -780 0 

(Lucknow B.Sc. 67) 

18. Apply Lagrange's formula to establish that
$$y_6 = -0.2\,y_0 + 0.5\,y_2 + y_8 - 0.3\,y_{10}$$
and then deduce that
$$y_1 = y_3 - 0.3\,(y_5 - y_{-3}) + 0.2\,(y_{-3} - y_{-5}).$$

[Shift the origin to 5] (Bangalore 66)

19. If all terms except $y_5$ of the sequence $y_1, y_2, \ldots, y_9$ be given, then the value of $y_5$ is
$$\frac{56\,(y_4 + y_6) - 28\,(y_3 + y_7) + 8\,(y_2 + y_8) - (y_1 + y_9)}{70}.$$

20. Apply Lagrange's formula to show that
$$u_0 = \tfrac{1}{2}(u_1 + u_{-1}) - \tfrac{1}{8}\left[\tfrac{1}{2}(u_3 - u_1) - \tfrac{1}{2}(u_{-1} - u_{-3})\right]$$
approximately. [Agra 1968; Bangalore 65]

18.16. General Quadrature Formula for Equidistant Ordinates.

Define $y_j = f(x_j)$ as an entry corresponding to an argument $x_j = a + jh$ (j = 0, 1, 2, ..., n) such that
$$b = x_n = a + nh, \quad \text{where } h = \frac{b-a}{n}$$
denotes the interval of differencing.

The general problem in quadrature is now to represent the function f(x) by a polynomial of a certain degree depending on the number of given entries, and then to evaluate the integral between the desired limits. The function f(x) corresponding to the given (n+1) entries can be approximated by a polynomial of nth degree such that
$$\Delta^{j} f(x) = \text{constant if } j = n, \qquad = 0 \text{ if } j > n.$$

$$\int_a^b f(x)\,dx = \int_a^{a+nh} f(x)\,dx.$$

Define t such that $t = \dfrac{x-a}{h}$. Then $dx = h\,dt$. Hence

$$\int_a^{a+nh} f(x)\,dx = h\int_0^n f(a + ht)\,dt$$
$$= h\int_0^n E^{t} f(a)\,dt$$
$$= h\int_0^n (1+\Delta)^{t} f(a)\,dt$$
$$= h\int_0^n \left[1 + t\,\Delta + \frac{t(t-1)}{2!}\,\Delta^{2} + \frac{t(t-1)(t-2)}{3!}\,\Delta^{3} + \cdots + \frac{t(t-1)\cdots(t-n+1)}{n!}\,\Delta^{n}\right] f(a)\,dt$$
$$= h\int_0^n \left[f(a) + t\,\Delta f(a) + \frac{t^{2}-t}{2!}\,\Delta^{2} f(a) + \frac{t^{3}-3t^{2}+2t}{3!}\,\Delta^{3} f(a) + \cdots\right] dt$$
$$= h\left[n f(a) + \frac{n^{2}}{2}\,\Delta f(a) + \left(\frac{n^{3}}{3} - \frac{n^{2}}{2}\right)\frac{\Delta^{2} f(a)}{2!} + \left(\frac{n^{4}}{4} - n^{3} + n^{2}\right)\frac{\Delta^{3} f(a)}{3!} + \cdots\right]$$ ...(1)

Note that $x_j = a + jh$, $j = 0 \Rightarrow x_0 = a \Rightarrow f(x_0) = f(a) = y_0$, since $y_j = f(x_j) = f(a + jh)$.

Then (1) assumes the form
$$\int_a^b f(x)\,dx = h\left[n y_0 + \frac{n^{2}}{2}\,\Delta y_0 + \left(\frac{n^{3}}{3} - \frac{n^{2}}{2}\right)\frac{\Delta^{2} y_0}{2!} + \left(\frac{n^{4}}{4} - n^{3} + n^{2}\right)\frac{\Delta^{3} y_0}{3!} + \cdots\right]$$ ...(2)

which is called the general quadrature formula.
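The coefficients of $\Delta^{k} y_0$ in the general quadrature formula are the integrals of $t(t-1)\cdots(t-k+1)/k!$ from 0 to n. The sketch below, which is an illustration with an assumed function name rather than anything from the text, computes them exactly so that the values used for n = 1, 2, 3 and 6 can be read off.

```python
from fractions import Fraction
from math import factorial

def quadrature_coefficient(k, n):
    """Exact integral over [0, n] of t(t-1)...(t-k+1)/k!."""
    poly = [Fraction(1)]                        # coefficients, lowest power first
    for r in range(k):
        shifted = [Fraction(0)] + poly          # t * poly
        scaled = [-r * c for c in poly] + [Fraction(0)]   # -r * poly
        poly = [s + c for s, c in zip(shifted, scaled)]   # poly * (t - r)
    integral = sum(c * Fraction(n) ** (p + 1) / (p + 1)
                   for p, c in enumerate(poly))
    return integral / factorial(k)

for n in (1, 2, 3, 6):
    print(n, [str(quadrature_coefficient(k, n)) for k in range(n + 1)])
```

For n = 2 the list ends with 1/3 and for n = 3 with 3/8, which is where the names of the two Simpson rules come from.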

Letting n = 1, 2, 3 respectively in (2) leads to three important rules, viz. the Trapezoidal rule, Simpson's 1/3rd rule and Simpson's 3/8th rule.

Trapezoidal Rule

This states that
$$\int_a^b f(x)\,dx = \int_a^{a+nh} f(x)\,dx = h\left[\tfrac{1}{2}(y_0 + y_n) + (y_1 + y_2 + \cdots + y_{n-1})\right]$$ ...(3)
where $b = a + nh$.

Proof. Letting n = 1 in (2) leads to
$$\int_a^{a+h} f(x)\,dx = h\left[y_0 + \tfrac{1}{2}\,\Delta y_0\right],$$
where the second and higher differences are neglected.

The definition of $\Delta y_0$ gives
$$\int_a^{a+h} f(x)\,dx = \frac{h}{2}\,(y_0 + y_1)$$ ...(4)

This indicates the area of one strip bounded by the ordinates $x = a$ and $x = a + h$.

Now shifting the origin, we find
$$\int_{a+h}^{a+2h} f(x)\,dx = \frac{h}{2}\,(y_1 + y_2),$$
$$\int_{a+2h}^{a+3h} f(x)\,dx = \frac{h}{2}\,(y_2 + y_3),$$
$$\cdots$$
$$\int_{a+(n-1)h}^{a+nh} f(x)\,dx = \frac{h}{2}\,(y_{n-1} + y_n).$$

Then
$$\int_a^{a+nh} f(x)\,dx = \int_a^{a+h} f(x)\,dx + \int_{a+h}^{a+2h} f(x)\,dx + \int_{a+2h}^{a+3h} f(x)\,dx + \cdots + \int_{a+(n-1)h}^{a+nh} f(x)\,dx$$
$$= \frac{h}{2}\,(y_0 + y_1) + \frac{h}{2}\,(y_1 + y_2) + \cdots + \frac{h}{2}\,(y_{n-2} + y_{n-1}) + \frac{h}{2}\,(y_{n-1} + y_n)$$
$$= \frac{h}{2}\left[y_0 + 2\,(y_1 + y_2 + \cdots + y_{n-1}) + y_n\right].$$

This proves (3).

(3) may be rewritten as
$$\int_a^{a+nh} f(x)\,dx = h\,[\text{mean of the two extreme ordinates} + \text{sum of all the intermediate ordinates}]$$ ...(5)

The integration formula (3) or (5) is called the Trapezoidal Rule.
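The rule in the form "mean of the extremes plus sum of the intermediate ordinates" is a one-line computation. The following sketch is illustrative only; the integrand sin x on (0, π/2) with 10 strips is chosen to match a worked example later in this section, and the function name is an assumption.

```python
import math

def trapezoidal(ordinates, h):
    """h * [mean of the two extreme ordinates + sum of the intermediate ones]."""
    return h * (0.5 * (ordinates[0] + ordinates[-1]) + sum(ordinates[1:-1]))

n = 10
h = (math.pi / 2) / n
ordinates = [math.sin(j * h) for j in range(n + 1)]   # y_0, y_1, ..., y_n
print(trapezoidal(ordinates, h))                      # a little below the exact value 1
```

Because sin x is concave on this range, every trapezium lies under the curve and the estimate falls short of the true area.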
Simpson's One-third Rule.

This says that
$$\int_a^b f(x)\,dx = \int_a^{a+2nh} f(x)\,dx = \frac{h}{3}\left[(y_0 + y_{2n}) + 4\,(y_1 + y_3 + \cdots + y_{2n-1}) + 2\,(y_2 + y_4 + \cdots + y_{2n-2})\right]$$ ...(6)
where $b = a + 2nh$.

Proof. Putting n = 2 in (2) leads to
$$\int_a^{a+2h} f(x)\,dx = h\left[2y_0 + 2\,\Delta y_0 + \left(\frac{8}{3} - \frac{4}{2}\right)\frac{\Delta^{2} y_0}{2!}\right]$$
$$= h\left[2y_0 + 2\,(y_1 - y_0) + \tfrac{1}{3}\,(y_2 - 2y_1 + y_0)\right]$$
$$= \frac{h}{3}\,(y_0 + 4y_1 + y_2),$$
which is Simpson's one-third rule.

Shifting the origin provides:
$$\int_{a+2h}^{a+4h} f(x)\,dx = \frac{h}{3}\,(y_2 + 4y_3 + y_4),$$
$$\int_{a+4h}^{a+6h} f(x)\,dx = \frac{h}{3}\,(y_4 + 4y_5 + y_6),$$
$$\cdots$$
$$\int_{a+(2n-2)h}^{a+2nh} f(x)\,dx = \frac{h}{3}\,(y_{2n-2} + 4y_{2n-1} + y_{2n}).$$

Then
$$\int_a^{a+2nh} f(x)\,dx = \int_a^{a+2h} f(x)\,dx + \int_{a+2h}^{a+4h} f(x)\,dx + \int_{a+4h}^{a+6h} f(x)\,dx + \cdots + \int_{a+(2n-2)h}^{a+2nh} f(x)\,dx$$
$$= \frac{h}{3}\,(y_0 + 4y_1 + y_2) + \frac{h}{3}\,(y_2 + 4y_3 + y_4) + \frac{h}{3}\,(y_4 + 4y_5 + y_6) + \cdots + \frac{h}{3}\,(y_{2n-2} + 4y_{2n-1} + y_{2n})$$
$$= \frac{h}{3}\left[y_0 + y_{2n} + 4\,(y_1 + y_3 + \cdots + y_{2n-1}) + 2\,(y_2 + y_4 + \cdots + y_{2n-2})\right].$$

This proves (6).

(6) may be rewritten as
$$\int_a^{a+2nh} f(x)\,dx = \frac{h}{3}\,[\text{sum of the extreme ordinates} + 4\,(\text{sum of the odd ordinates}) + 2\,(\text{sum of the even ordinates})]$$ ...(7)

Deduction. Letting a = 0 and h = 1 in (6) leads to
$$\int_0^{2n} u_x\,dx = \tfrac{1}{3}\left[u_0 + 4\,(u_1 + u_3 + \cdots + u_{2n-1}) + 2\,(u_2 + u_4 + \cdots + u_{2n-2}) + u_{2n}\right].$$

(Bombay B.Sc. 61)
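Form (7) of the rule can be sketched with list slicing. This is an illustrative implementation with an assumed function name; the depth table of the river example given later in this section (ordinates 10 feet apart) serves as data.

```python
def simpson_one_third(ordinates, h):
    """(h/3)[extremes + 4(odd ordinates) + 2(even ordinates)]."""
    if len(ordinates) < 3 or len(ordinates) % 2 == 0:
        raise ValueError("need an even number of strips (odd number of ordinates)")
    extremes = ordinates[0] + ordinates[-1]
    odd = sum(ordinates[1:-1:2])     # y_1, y_3, ..., y_{2n-1}
    even = sum(ordinates[2:-1:2])    # y_2, y_4, ..., y_{2n-2}
    return h / 3 * (extremes + 4 * odd + 2 * even)

depths = [0, 4, 7, 9, 12, 15, 14, 8, 3]   # river depths in feet at x = 0, 10, ..., 80
print(simpson_one_third(depths, 10))       # area of cross-section in square feet
```

The guard rejects an odd number of strips, for which the rule in this form does not apply.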

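Simpson's Three-Eighth Rule.

This rule states that
$$\int_a^{a+3h} f(x)\,dx = \frac{3h}{8}\,(y_0 + 3y_1 + 3y_2 + y_3)$$ ...(8)
and in general,
$$\int_a^b f(x)\,dx = \int_a^{a+3nh} f(x)\,dx = \frac{3h}{8}\left[(y_0 + y_{3n}) + 3\,(y_1 + y_2 + y_4 + y_5 + \cdots + y_{3n-2} + y_{3n-1}) + 2\,(y_3 + y_6 + \cdots + y_{3n-3})\right]$$ ...(9)
where $y_0 = f(a)$, $y_j = f(a + jh)$, j = 0, 1, 2, ..., 3n.

Proof. Putting n = 3 in (2) leads to
$$\int_a^{a+3h} f(x)\,dx = h\left[3y_0 + \frac{9}{2}\,\Delta y_0 + \left(9 - \frac{9}{2}\right)\frac{\Delta^{2} y_0}{2!} + \left(\frac{81}{4} - 27 + 9\right)\frac{\Delta^{3} y_0}{3!}\right],$$
where fourth and higher order differences are neglected.

As a matter of fact, $\Delta y_0 = y_1 - y_0$, $\Delta^{2} y_0 = y_2 - 2y_1 + y_0$ and $\Delta^{3} y_0 = y_3 - 3y_2 + 3y_1 - y_0$.

Then
$$\int_a^{a+3h} f(x)\,dx = \frac{3h}{8}\,(y_0 + 3y_1 + 3y_2 + y_3).$$

This proves (8).

Shifting the origin, we find
$$\int_{a+3h}^{a+6h} f(x)\,dx = \frac{3h}{8}\,(y_3 + 3y_4 + 3y_5 + y_6),$$
$$\int_{a+6h}^{a+9h} f(x)\,dx = \frac{3h}{8}\,(y_6 + 3y_7 + 3y_8 + y_9),$$
$$\cdots$$
$$\int_{a+(3n-3)h}^{a+3nh} f(x)\,dx = \frac{3h}{8}\,(y_{3n-3} + 3y_{3n-2} + 3y_{3n-1} + y_{3n}).$$

Consequently,
$$\int_a^{a+3nh} f(x)\,dx = \frac{3h}{8}\,(y_0 + 3y_1 + 3y_2 + y_3) + \frac{3h}{8}\,(y_3 + 3y_4 + 3y_5 + y_6) + \frac{3h}{8}\,(y_6 + 3y_7 + 3y_8 + y_9) + \cdots + \frac{3h}{8}\,(y_{3n-3} + 3y_{3n-2} + 3y_{3n-1} + y_{3n})$$
$$= \frac{3h}{8}\left[(y_0 + y_{3n}) + 3\,(y_1 + y_2 + y_4 + y_5 + \cdots + y_{3n-2} + y_{3n-1}) + 2\,(y_3 + y_6 + \cdots + y_{3n-3})\right].$$

This proves (9).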
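A minimal sketch of the composite three-eighth rule follows; the function name is an assumption, and the integrand 1/(1+x²) on (0, 1) with six strips is taken from a worked example later in this section, where the result is used to approximate π.

```python
import math

def simpson_three_eighth(ordinates, h):
    """(3h/8)[extremes + 3(other ordinates) + 2(ordinates at multiples of 3)]."""
    strips = len(ordinates) - 1
    if strips < 3 or strips % 3 != 0:
        raise ValueError("number of strips must be a multiple of 3")
    total = ordinates[0] + ordinates[-1]
    for j in range(1, strips):
        total += (2 if j % 3 == 0 else 3) * ordinates[j]
    return 3 * h / 8 * total

h = 1 / 6
ordinates = [1 / (1 + (j * h) ** 2) for j in range(7)]
approx = simpson_three_eighth(ordinates, h)
print(approx, 4 * approx, math.pi)
```

The weight 2 falls on the ordinates shared by two adjacent blocks of three strips, exactly as in the summation above.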

Weddle's Rule

This rule says that if $b = a + 6h$, then
$$\int_a^b f(x)\,dx = \int_a^{a+6h} f(x)\,dx = \frac{3h}{10}\,(y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 + y_6)$$ ...(10)
and in general,
$$\int_a^b f(x)\,dx = \int_a^{a+6nh} f(x)\,dx = \frac{3h}{10}\left[y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 + 2y_6 + 5y_7 + \cdots\right]$$ ...(11)
where $b = a + 6nh$.

Proof. Letting n = 6 in (2) provides:
$$\int_a^{a+6h} f(x)\,dx = h\left[6y_0 + 18\,\Delta y_0 + 27\,\Delta^{2} y_0 + 24\,\Delta^{3} y_0 + \frac{123}{10}\,\Delta^{4} y_0 + \frac{33}{10}\,\Delta^{5} y_0 + \frac{41}{140}\,\Delta^{6} y_0\right],$$
where differences of orders higher than six are neglected.

In fact, $\dfrac{3}{10} - \dfrac{41}{140} = \dfrac{1}{140}$. Hence, if the coefficient $\dfrac{41}{140}$ of the last term is replaced by $\dfrac{3}{10}$, an error of $\dfrac{h}{140}\,\Delta^{6} y_0$ is committed, and this error is negligible if h is sufficiently small. Therefore, replacing the last term $\dfrac{41}{140}\,\Delta^{6} y_0$ by $\dfrac{3}{10}\,\Delta^{6} y_0$ and substituting the values of the differences, we find
$$\int_a^{a+6h} f(x)\,dx = \frac{3h}{10}\,(y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 + y_6).$$

This proves (10).

Shifting the origin, we obtain
$$\int_{a+6h}^{a+12h} f(x)\,dx = \frac{3h}{10}\,(y_6 + 5y_7 + y_8 + 6y_9 + y_{10} + 5y_{11} + y_{12}),$$
$$\cdots$$
$$\int_{a+(6n-6)h}^{a+6nh} f(x)\,dx = \frac{3h}{10}\,(y_{6n-6} + 5y_{6n-5} + y_{6n-4} + 6y_{6n-3} + y_{6n-2} + 5y_{6n-1} + y_{6n}).$$

Then
$$\int_a^{a+6nh} f(x)\,dx = \int_a^{a+6h} f(x)\,dx + \int_{a+6h}^{a+12h} f(x)\,dx + \cdots + \int_{a+(6n-6)h}^{a+6nh} f(x)\,dx$$
$$= \frac{3h}{10}\left[y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 + y_6 + y_6 + 5y_7 + y_8 + 6y_9 + y_{10} + 5y_{11} + y_{12} + \cdots + y_{6n-6} + 5y_{6n-5} + y_{6n-4} + 6y_{6n-3} + y_{6n-2} + 5y_{6n-1} + y_{6n}\right]$$
$$= \frac{3h}{10}\left[(y_0 + y_{6n}) + (y_2 + y_8 + \cdots + y_{6n-4}) + (y_4 + y_{10} + \cdots + y_{6n-2}) + 5\,(y_1 + y_7 + \cdots + y_{6n-5}) + 5\,(y_5 + y_{11} + \cdots + y_{6n-1}) + 2\,(y_6 + y_{12} + \cdots + y_{6n-6}) + 6\,(y_3 + y_9 + \cdots + y_{6n-3})\right].$$

This proves (11).

Remark. The error in any of the above formulae is given by
$$E = \int_a^b f(x)\,dx - \int_a^b P(x)\,dx$$ ...(12)
where P(x) is the approximating polynomial.

Weddle's rule of numerical integration is, in general, more accurate than Simpson's rules.
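Weddle's rule can be applied block by block, six strips at a time. The sketch below is illustrative, with an assumed function name; it is tried on the integral of log x from 4 to 5.2 with h = 0.2, the data of a worked example later in this section, and compared with the exact value x log x - x.

```python
import math

def weddle(ordinates, h):
    """Composite Weddle rule; the number of strips must be a multiple of 6."""
    strips = len(ordinates) - 1
    if strips < 6 or strips % 6 != 0:
        raise ValueError("number of strips must be a multiple of 6")
    pattern = [1, 5, 1, 6, 1, 5]          # weights of y_0 ... y_5 in each block
    total = 0.0
    for start in range(0, strips, 6):
        block = ordinates[start:start + 7]
        total += sum(w * y for w, y in zip(pattern, block)) + block[6]
    return 3 * h / 10 * total

h = 0.2
ordinates = [math.log(4 + j * h) for j in range(7)]
exact = 5.2 * math.log(5.2) - 4 * math.log(4) - 1.2
print(weddle(ordinates, h), exact)
```

An ordinate shared by two blocks is counted once in each, which reproduces the weight 2 in formula (11).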

Independent Proofs of Simpson's 1/3rd and 3/8th Rules

Simpson's 1/3rd Rule. Assume $y_0$, $y_1$ and $y_2$ are entries corresponding to the arguments x = 0, 1 and 2. Let the interval of differencing be unity. Then f(x) can be approximated by a second degree polynomial. Therefore, assume
$$f(x) = a + bx + cx^{2}$$ ...(1)

Then
$$\int_0^2 f(x)\,dx = \int_0^2 (a + bx + cx^{2})\,dx = 2a + 2b + \tfrac{8}{3}\,c$$ ...(2)

Letting x = 0, 1 and 2 in (1) leads to
$$y_0 = f(0) = a, \quad y_1 = f(1) = a + b + c \quad \text{and} \quad y_2 = f(2) = a + 2b + 4c,$$
from which we obtain
$$a = y_0, \quad b = \tfrac{1}{2}\,(4y_1 - 3y_0 - y_2), \quad c = \tfrac{1}{2}\,(y_2 - 2y_1 + y_0).$$

Substitution of the values of a, b, c in terms of $y_0$, $y_1$ and $y_2$ in (2) gives:
$$\int_0^2 f(x)\,dx = 2y_0 + (4y_1 - 3y_0 - y_2) + \tfrac{4}{3}\,(y_2 - 2y_1 + y_0) = \tfrac{1}{3}\,(y_0 + 4y_1 + y_2),$$
which is Simpson's 1/3rd rule.

Shifting the origin, we have
$$\int_2^4 f(x)\,dx = \tfrac{1}{3}\,(y_2 + 4y_3 + y_4),$$
$$\cdots$$
$$\int_{2n-2}^{2n} f(x)\,dx = \tfrac{1}{3}\,(y_{2n-2} + 4y_{2n-1} + y_{2n}).$$

Adding,
$$\int_0^{2n} f(x)\,dx = \tfrac{1}{3}\left[(y_0 + y_{2n}) + 4\,(y_1 + y_3 + y_5 + \cdots + y_{2n-1}) + 2\,(y_2 + y_4 + \cdots + y_{2n-2})\right],$$
which is the generalisation of Simpson's 1/3rd rule.

Changing the scale by putting x = ht, so that dx = h dt, we find
$$\int_0^{2nh} f(x)\,dx = \frac{h}{3}\left[(y_0 + y_{2n}) + 4\,(y_1 + y_3 + \cdots + y_{2n-1}) + 2\,(y_2 + y_4 + \cdots + y_{2n-2})\right].$$

This is the generalisation of Simpson's one-third rule of numerical integration when h is the interval of differencing.

Simpson's 3/8th Rule. Suppose $y_0$, $y_1$, $y_2$ and $y_3$ are entries corresponding to the arguments x = 0, 1, 2 and 3. Then we wish to evaluate $\int_0^3 f(x)\,dx$ in terms of $y_0$, $y_1$, $y_2$ and $y_3$.

Doubling the scale implies evaluation of $\int_0^6 f(x)\,dx$ in terms of $y_0$, $y_2$, $y_4$ and $y_6$. Shifting the origin to 3 allows us to find $\int_{-3}^{3} f(x)\,dx$ in terms of $y_{-3}$, $y_{-1}$, $y_1$ and $y_3$. Hence we are justified in assuming
$$\int_{-3}^{3} u_x\,dx = A\,(u_{-1} + u_1) + B\,(u_{-3} + u_3),$$ ...(1)
where A and B are constants to be determined.

Suppose
$$f(x) = a + bx + cx^{2} + dx^{3},$$
in view of being given four entries.

Then $y_{-1} + y_1 = f(-1) + f(1) = 2\,(a + c)$
and $y_{-3} + y_3 = f(-3) + f(3) = 2\,(a + 9c)$.

Therefore,
$$\int_{-3}^{3} f(x)\,dx = A\cdot 2\,(a + c) + B\cdot 2\,(a + 9c).$$

Then
$$\left[ax + \frac{bx^{2}}{2} + \frac{cx^{3}}{3} + \frac{dx^{4}}{4}\right]_{-3}^{3} = 2A\,(a + c) + 2B\,(a + 9c)$$
or
$$2\,(3a + 9c) = 2A\,(a + c) + 2B\,(a + 9c).$$

Identifying the coefficients of a and c on both sides leads to
$$3 = A + B,$$
$$9 = A + 9B.$$

Solving these equations, we find
$$A = 9/4, \quad B = 3/4.$$

Therefore (1) becomes
$$\int_{-3}^{3} u_x\,dx = \tfrac{3}{4}\left[3\,(u_{-1} + u_1) + (u_{-3} + u_3)\right],$$
so that on shifting the origin
$$\int_0^6 f(x)\,dx = \tfrac{3}{4}\left[3\,(u_2 + u_4) + (u_0 + u_6)\right].$$

Changing the scale to 1/2 gives
$$\int_0^3 f(x)\,dx = \tfrac{3}{8}\left[3\,(u_1 + u_2) + (u_0 + u_3)\right] = \tfrac{3}{8}\,(u_0 + 3u_1 + 3u_2 + u_3).$$

Evidently
$$\int_0^{3h} f(x)\,dx = \frac{3h}{8}\,(u_0 + 3u_1 + 3u_2 + u_3)$$
(where h denotes the interval of differencing),

which is the required formula for Simpson's 3/8th rule.

Illustrative Examples

Ex. 1. A river is 80 feet wide. The depth d (in feet) of the river at a distance x from one bank is given by the following table:

x :   0    10   20   30   40   50   60   70   80
d :   0    4    7    9    12   15   14   8    3

Compute the area of cross-section of the river approximately.

[Delhi 1961; Bombay 57]

Simpson's 1/3rd rule provides:
$$\text{Area} = \int_0^{80} f(x)\,dx = 10\int_0^8 u_x\,dx$$
$$= 10\cdot\tfrac{1}{3}\left[(u_0 + u_8) + 4\,(u_1 + u_3 + u_5 + u_7) + 2\,(u_2 + u_4 + u_6)\right]$$
$$= \tfrac{10}{3}\left[(0 + 3) + 4\,(4 + 9 + 15 + 8) + 2\,(7 + 12 + 14)\right]$$
$$= \tfrac{10}{3}\,[3 + 4 \times 36 + 2 \times 33]$$
$$= \tfrac{10}{3}\,(213) = 710 \text{ square feet approximately.}$$

Ex. 2. Compute an approximate value of
$$\int_0^{\pi/2} \sin x\,dx$$
(i) by the Trapezoidal rule,
(ii) by Simpson's rule, using 11 ordinates.

We shall divide the range (0, π/2) into 10 equal parts to obtain h = π/20. Using trigonometrical tables, we have

$y_0 = \sin 0 = 0.0000$,           $y_1 = \sin \pi/20 = 0.1564$,
$y_2 = \sin \pi/10 = 0.3090$,       $y_3 = \sin 3\pi/20 = 0.4540$,
$y_4 = \sin 2\pi/10 = 0.5878$,      $y_5 = \sin \pi/4 = 0.7071$,
$y_6 = \sin 3\pi/10 = 0.8090$,      $y_7 = \sin 7\pi/20 = 0.8910$,
$y_8 = \sin 2\pi/5 = 0.9511$,       $y_9 = \sin 9\pi/20 = 0.9877$,
$y_{10} = \sin \pi/2 = 1.0000$.

The Trapezoidal rule gives:
$$\int_0^{\pi/2} \sin x\,dx = h\left[\tfrac{1}{2}(y_0 + y_{10}) + (y_1 + y_2 + \cdots + y_9)\right]$$
$$= \frac{\pi}{20}\,[0.5 + 5.8531] = 0.15708 \times 6.3531$$
$$= 0.9979.$$

This is the approximate value of $\int_0^{\pi/2} \sin x\,dx$ by the Trapezoidal rule.

Simpson's 1/3rd rule provides:
$$\int_0^{\pi/2} \sin x\,dx = \frac{h}{3}\left[(y_0 + y_{10}) + 4\,(y_1 + y_3 + \cdots + y_9) + 2\,(y_2 + y_4 + y_6 + y_8)\right]$$
$$= \frac{\pi}{60}\,[1 + 12.7848 + 5.3138] = 0.05236 \times 19.0986$$
$$= 1.0000.$$

In fact
$$\int_0^{\pi/2} \sin x\,dx = \Big[-\cos x\Big]_0^{\pi/2} = 1.$$

The error in the computation by Simpson's 1/3rd rule is therefore negligible to four decimal places, while by the Trapezoidal rule the error is 0.0021 in defect. This implies that there is greater accuracy in the computation by Simpson's rule.

Ex. 3. Apply Simpson's rules to compute
$$\int_0^1 \frac{dx}{1 + x^{2}}$$
and then find the approximate value of π in each case.

[Agra B.Sc. 1964]

Simpson's 1/3rd and 3/8th rules are both applicable only when the range (0, 1) is divided into a number of parts which is divisible by both 2 and 3. Then we have to divide the range (0, 1) into six equal parts. Evidently h = 1/6.

Define $y_j$: $y_j = f(a + jh) = f(jh)$, since a = 0 and b = 1. Then

$y_0 = f(0) = 1.00000$
$y_1 = f(h) = \dfrac{1}{1 + 1/36} = \dfrac{36}{37} = 0.97297$
$y_2 = f(2h) = \dfrac{36}{40} = 0.90000$
$y_3 = f(3h) = \dfrac{36}{45} = 0.80000$
$y_4 = f(4h) = \dfrac{36}{52} = 0.69231$
$y_5 = f(5h) = \dfrac{36}{61} = 0.59016$
$y_6 = f(1) = \dfrac{1}{2} = 0.50000$

Simpson's 1/3rd rule provides:
$$\int_0^1 f(x)\,dx = \int_a^{a+6h} f(x)\,dx = \frac{h}{3}\left[(y_0 + y_6) + 4\,(y_1 + y_3 + y_5) + 2\,(y_2 + y_4)\right]$$
$$= 0.785397, \text{ after simplification.}$$

But
$$\int_0^1 \frac{dx}{1 + x^{2}} = \Big[\tan^{-1} x\Big]_0^1 = \frac{\pi}{4}.$$

Then $\pi = 4\int_0^1 f(x)\,dx = 4\,(0.785397) = 3.141588$.

By Simpson's 3/8th rule, we find
$$\int_0^1 f(x)\,dx = \int_a^{a+6h} f(x)\,dx = \frac{3h}{8}\left[(y_0 + y_6) + 3\,(y_1 + y_2 + y_4 + y_5) + 2y_3\right]$$
$$= \tfrac{1}{16}\,[1.50000 + 3\,(0.97297 + 0.90000 + 0.69231 + 0.59016) + 2\,(0.80000)]$$
$$= 0.785395, \text{ after simplification.}$$

Then $\pi = 4\,(0.785395) = 3.141580$.

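Ex. 4. Apply Simpson's rule to evaluate $\int_0^4 e^{x}\,dx$, given that $e = 2.72$, $e^{2} = 7.39$, $e^{3} = 20.09$ and $e^{4} = 54.60$, and compare it with the exact value.

Our hypothesis provides five entries: $e^{0}, e^{1}, e^{2}, e^{3}, e^{4}$. Then we are eligible to apply Simpson's 1/3rd rule (the number of strips, four, being divisible by 2). Clearly h = 1.

By Simpson's 1/3rd rule, we find
$$\int_0^4 e^{x}\,dx = \tfrac{1}{3}\left[(y_0 + y_4) + 4\,(y_1 + y_3) + 2y_2\right]$$
$$= \tfrac{1}{3}\,[1 + 54.60 + 4\,(2.72 + 20.09) + 2\,(7.39)]$$
$$= \tfrac{1}{3}\,[55.60 + 91.24 + 14.78]$$
$$= \tfrac{1}{3}\,(161.62) = 53.873.$$

In fact,
$$\int_0^4 e^{x}\,dx = \Big[e^{x}\Big]_0^4 = e^{4} - e^{0} = 53.60.$$

Then the error = 53.873 - 53.60 = 0.273.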

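Ex. 5. Show that
$$\int_0^1 \frac{dx}{1 + x} = \log 2 = 0.69315.$$

[Meerut 1970, 69, 68]

For this, we divide the whole range (0, 1) into 10 equal parts. Evidently $f(x) = 1/(1 + x)$.

Define $x_j$: $x_j = x_0 + \dfrac{j}{10}$, where $x_0 = 0$. This implies that
$$x_j = \frac{j}{10} \quad \text{and} \quad h = \frac{1}{10} = 0.1.$$

Obviously
$$f(x_j) = \frac{1}{1 + x_j} = \frac{1}{1 + (j/10)}.$$

Now the application of Simpson's one-third rule leads to
$$\int_0^1 f(x)\,dx = \int_{x_0}^{x_0 + 10h} f(x)\,dx$$
$$= \frac{h}{3}\left[f(x_0) + f(x_0 + 10h) + 4\,\{f(x_0 + h) + f(x_0 + 3h) + \cdots + f(x_0 + 9h)\} + 2\,\{f(x_0 + 2h) + \cdots + f(x_0 + 8h)\}\right]$$
$$= \frac{1}{30}\left[1 + \frac{1}{2} + 4\left(\frac{1}{1.1} + \frac{1}{1.3} + \frac{1}{1.5} + \frac{1}{1.7} + \frac{1}{1.9}\right) + 2\left(\frac{1}{1.2} + \frac{1}{1.4} + \frac{1}{1.6} + \frac{1}{1.8}\right)\right]$$
$$= \frac{1}{30}\,[1.5 + 4 \times 3.45954 + 2 \times 2.72818]$$
$$= \frac{1}{30}\,[20.79452] = 0.693151 \approx 0.69315.$$

In fact,
$$\int_0^1 \frac{dx}{1 + x} = \Big[\log (1 + x)\Big]_0^1 = \log 2.$$

Therefore, $\log 2 = \int_0^1 \dfrac{dx}{1 + x} = 0.69315$.

This is what we wished to prove.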

Ex. 6. Compute the value of the integral
$$\int_4^{5.2} \log_e x\,dx$$
by using Weddle's rule.

To apply Weddle's rule, it is necessary that the number of sub-divisions be a multiple of six. Then we shall divide the interval of integration, viz. 5.2 - 4 = 1.2, into six equal parts of width 0.2. Evidently h = (1.2/6) = 0.2. The value of $\log_e x$ for each point of subdivision is given as follows:

x      $\log_e x$               x      $\log_e x$
4.0    1.38629436 = y_0         4.8    1.56861592 = y_4
4.2    1.43508453 = y_1         5.0    1.60943791 = y_5
4.4    1.48160454 = y_2         5.2    1.64865863 = y_6
4.6    1.52605630 = y_3

Weddle's rule provides:
$$\int_a^{a+6h} f(x)\,dx = \frac{3h}{10}\left[(y_0 + y_6) + 5\,(y_1 + y_5) + (y_2 + y_4) + 6y_3\right]$$
$$= \frac{3\,(0.2)}{10}\,[3.03495299 + 5\,(3.04452244) + 3.05022046 + 6\,(1.52605630)]$$
$$= 1.8278474.$$

Ex. 7. If $U_x = a + bx + cx^{2}$, then
$$\int_1^3 U_x\,dx = 2U_2 + \tfrac{1}{12}\,(U_0 - 2U_2 + U_4),$$
and hence find an approximation to
$$\int_{-1/2}^{1/2} e^{-x^{2}/10}\,dx.$$

(L.U. B.Sc. 61, D.U. B.Sc. (Hons.) 68, D.U. M

Shifting the origin to 2 reduces the above result to
$$\int_{-1}^{1} U_x\,dx = 2U_0 + \tfrac{1}{12}\,(U_{-2} - 2U_0 + U_2)$$ ...(1)

For, L.H.S. $= \int_{-1}^{1} f(x)\,dx = \int_{-1}^{1} (a + bx + cx^{2})\,dx = 2\,(a + c/3)$.

But $f(x) = a + bx + cx^{2}$ gives
$$y_0 = a, \quad y_{-2} = a - 2b + 4c, \quad y_2 = a + 2b + 4c.$$

Then R.H.S. of (1) $= 2a + \tfrac{1}{12}\,(2a + 8c - 2a)$
$= 2a + \tfrac{1}{12}\cdot 8c = 2\,(a + c/3) = $ L.H.S.

Changing the scale to 1/2 leads to
$$\int_{-1/2}^{1/2} U_x\,dx = \tfrac{1}{2}\left[2U_0 + \tfrac{1}{12}\,(U_{-1} - 2U_0 + U_1)\right] = U_0 + \tfrac{1}{24}\,(U_{-1} - 2U_0 + U_1).$$

Letting $f(x) = e^{-x^{2}/10}$ gives $y_0 = 1$, $y_1 = e^{-1/10}$, $y_{-1} = e^{-1/10}$, and then
$$\int_{-1/2}^{1/2} e^{-x^{2}/10}\,dx = 1 + \tfrac{1}{24}\,(e^{-1/10} + e^{-1/10} - 2) = 1 + \tfrac{1}{12}\,(e^{-1/10} - 1).$$

This is what we wished to find.

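Ex. 8. Prove that
$$\int_{-1}^{1} U_x\,dx = \tfrac{1}{12}\left[13\,(U_1 + U_{-1}) - (U_3 + U_{-3})\right]$$ ...(1)

[A.U. B.Sc. 63]

The right hand side indicates that there are only four entries, i.e. $U_1$, $U_{-1}$, $U_3$ and $U_{-3}$; then assume $U_x$ to be a third degree polynomial. Hence suppose
$$U_x = A + Bx + Cx^{2} + Dx^{3}$$ ...(2)

Now L.H.S. of (1)
$$= \int_{-1}^{1} (A + Bx + Cx^{2} + Dx^{3})\,dx = \left[Ax + \frac{Bx^{2}}{2} + \frac{Cx^{3}}{3} + \frac{Dx^{4}}{4}\right]_{-1}^{1} = 2\left(A + \frac{C}{3}\right).$$

From (2),
$$U_1 = A + B + C + D, \quad U_{-1} = A - B + C - D,$$
$$U_3 = A + 3B + 9C + 27D, \quad U_{-3} = A - 3B + 9C - 27D.$$

Then R.H.S. $= \tfrac{1}{12}\left[13\,(U_1 + U_{-1}) - (U_3 + U_{-3})\right]$
$= \tfrac{1}{12}\,[13\,(2A + 2C) - (2A + 18C)]$
$= \tfrac{1}{12}\,[24A + 8C] = 2\left[A + \dfrac{C}{3}\right]$
= L.H.S.

Thus we are through.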

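The Euler-Maclaurin Summation Formula.

Theorem. Define $x_j$: $x_j = x_0 + jh$, where $h = \dfrac{x_n - x_0}{n}$, and j = 0, 1, 2, ..., n. Let $f(x_j)$ be the entry corresponding to the argument $x_j$. Then the Euler-Maclaurin summation formula is given by
$$\frac{1}{h}\int_{x_0}^{x_0 + nh} f(x)\,dx = \sum_{j=0}^{n} f(x_j) - \tfrac{1}{2}\,[f(x_n) + f(x_0)] - \frac{h}{12}\,[f'(x_0 + nh) - f'(x_0)] + \frac{h^{3}}{720}\,[f'''(x_0 + nh) - f'''(x_0)] - \cdots$$ ...(1)

Proof. The basis of formula (1) is the expansion of operators.

To establish (1), assume $\Delta F(x) = f(x)$.

Then the operator $\Delta^{-1}$ is defined by
$$F(x) = \Delta^{-1} f(x).$$

Evidently $\Delta F(x_0) = f(x_0)$.

Then $F(x_1) - F(x_0) = f(x_0)$, in view of the definition of $\Delta$.

The similar argument leads to
$$F(x_2) - F(x_1) = f(x_1),$$
$$\cdots$$
$$F(x_n) - F(x_{n-1}) = f(x_{n-1}).$$

Thus
$$F(x_n) - F(x_0) = F(x_n) - F(x_{n-1}) + F(x_{n-1}) - F(x_{n-2}) + \cdots + F(x_1) - F(x_0) = \sum_{j=0}^{n-1} f(x_j)$$ ...(2)
where $x_j = x_0 + jh$, $h = \dfrac{x_n - x_0}{n}$.

Now
$$F(x) = \Delta^{-1} f(x) = (E - 1)^{-1} f(x) = (e^{hD} - 1)^{-1} f(x), \quad \text{where } E = e^{hD},$$
$$= \left[\left(1 + hD + \frac{h^{2}D^{2}}{2!} + \frac{h^{3}D^{3}}{3!} + \cdots\right) - 1\right]^{-1} f(x)$$
$$= \left(hD + \frac{h^{2}D^{2}}{2!} + \frac{h^{3}D^{3}}{3!} + \cdots\right)^{-1} f(x)$$
$$= (hD)^{-1}\left[1 + \left(\frac{hD}{2!} + \frac{h^{2}D^{2}}{3!} + \cdots\right)\right]^{-1} f(x)$$
$$= (hD)^{-1}\left[1 - \left(\frac{hD}{2!} + \frac{h^{2}D^{2}}{3!} + \cdots\right) + \frac{(-1)(-2)}{2!}\left(\frac{hD}{2!} + \frac{h^{2}D^{2}}{3!} + \cdots\right)^{2} - \cdots\right] f(x)$$
$$= \frac{1}{h}\,D^{-1}\left[1 - \frac{hD}{2} + \frac{h^{2}D^{2}}{12} - \frac{h^{4}D^{4}}{720} + \cdots\right] f(x)$$
$$= \left(\frac{1}{h}\,D^{-1} - \frac{1}{2} + \frac{hD}{12} - \frac{h^{3}D^{3}}{720} + \cdots\right) f(x).$$

The fact that $D^{-1}$ is the inverse of D implies that $D^{-1}$ is the integral operator. Then
$$F(x) = \frac{1}{h}\int f(x)\,dx - \tfrac{1}{2}\,f(x) + \frac{h}{12}\,f'(x) - \frac{h^{3}}{720}\,f'''(x) + \cdots$$ ...(3)

Inserting $x = x_n$ and $x = x_0$ in (3) leads to
$$F(x_n) - F(x_0) = \frac{1}{h}\int_{x_0}^{x_n} f(x)\,dx - \tfrac{1}{2}\,[f(x_n) - f(x_0)] + \frac{h}{12}\,[f'(x_n) - f'(x_0)] - \frac{h^{3}}{720}\,[f'''(x_n) - f'''(x_0)] + \cdots$$ ...(4)

By means of (2), (4) assumes the form
$$\sum_{j=0}^{n-1} f(x_j) = \frac{1}{h}\int_{x_0}^{x_n} f(x)\,dx - \tfrac{1}{2}\,[f(x_n) - f(x_0)] + \frac{h}{12}\,[f'(x_n) - f'(x_0)] - \frac{h^{3}}{720}\,[f'''(x_n) - f'''(x_0)] + \cdots,$$
which gives:
$$\frac{1}{h}\int_{x_0}^{x_n} f(x)\,dx = \sum_{j=0}^{n-1} f(x_j) + \tfrac{1}{2}\,[f(x_n) - f(x_0)] - \frac{h}{12}\,[f'(x_n) - f'(x_0)] + \frac{h^{3}}{720}\,[f'''(x_n) - f'''(x_0)] - \cdots$$ ...(5)

In fact,
$$\sum_{j=0}^{n-1} f(x_j) = \sum_{j=0}^{n} f(x_j) - f(x_n) \quad \text{and} \quad x_n = x_0 + nh.$$

Then (5) becomes
$$\frac{1}{h}\int_{x_0}^{x_0 + nh} f(x)\,dx = \sum_{j=0}^{n} f(x_j) - \tfrac{1}{2}\,[f(x_n) + f(x_0)] - \frac{h}{12}\,[f'(x_0 + nh) - f'(x_0)] + \frac{h^{3}}{720}\,[f'''(x_0 + nh) - f'''(x_0)] - \cdots$$

This proves (1).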
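Formula (1), truncated after the third-derivative term, can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the function name is hypothetical, the derivatives are supplied by hand, and the integrand 1/(1+x) on (0, 1) with h = 0.1 is taken from a worked example later in this section.

```python
import math

def euler_maclaurin_integral(f, d1f, d3f, x0, n, h):
    """Integral of f over [x0, x0 + n*h] from the truncated summation formula.

    d1f and d3f are the first and third derivatives of f (supplied by the caller).
    """
    xn = x0 + n * h
    total = sum(f(x0 + j * h) for j in range(n + 1))   # sum of all n+1 entries
    total -= 0.5 * (f(xn) + f(x0))
    total -= h / 12 * (d1f(xn) - d1f(x0))
    total += h**3 / 720 * (d3f(xn) - d3f(x0))
    return h * total

approx = euler_maclaurin_integral(
    lambda x: 1 / (1 + x),
    lambda x: -1 / (1 + x) ** 2,
    lambda x: -6 / (1 + x) ** 4,
    x0=0.0, n=10, h=0.1,
)
print(approx, math.log(2))
```

The first two terms alone reproduce the Trapezoidal rule; the derivative terms act as end corrections to it.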

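Illustrative Examples.

Ex. 1. $\int_0^1 u_x\,dx = \tfrac{1}{12}\,(5u_1 + 8u_0 - u_{-1})$, approximately.

(Agra B.Sc. 1965; Bombay 68)

The problem indicates that we are given the entries $u_{-1}$, $u_0$ and $u_1$. Hence we assume $u_x$ to be a polynomial of degree 2. Therefore Lagrange's formula provides:
$$u_x = \frac{(x-0)(x-1)}{(-1-0)(-1-1)}\,u_{-1} + \frac{(x+1)(x-1)}{(0+1)(0-1)}\,u_0 + \frac{(x+1)(x-0)}{(1+1)(1-0)}\,u_1$$
$$= \tfrac{1}{2}\,(x^{2} - x)\,u_{-1} - (x^{2} - 1)\,u_0 + \tfrac{1}{2}\,(x^{2} + x)\,u_1.$$

Integration with regard to x between the limits 0 and 1 leads to
$$\int_0^1 u_x\,dx = \tfrac{1}{2}\left[\frac{x^{3}}{3} - \frac{x^{2}}{2}\right]_0^1 u_{-1} - \left[\frac{x^{3}}{3} - x\right]_0^1 u_0 + \tfrac{1}{2}\left[\frac{x^{3}}{3} + \frac{x^{2}}{2}\right]_0^1 u_1$$
$$= -\tfrac{1}{12}\,u_{-1} + \tfrac{2}{3}\,u_0 + \tfrac{5}{12}\,u_1,$$
i.e.
$$\int_0^1 u_x\,dx = \tfrac{1}{12}\,(5u_1 + 8u_0 - u_{-1}).$$

This proves the desired result.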


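Ex. 2. Establish the approximate formula

$$\int_{-1}^{1} u_x\,dx = \frac{13(u_1 + u_{-1}) - (u_3 + u_{-3})}{12}$$

by using Lagrange's formula.

For, Lagrange's formula for the arguments $x = -3, -1, 1, 3$ provides:

$$u_x = \frac{(x+1)(x-1)(x-3)}{(-3+1)(-3-1)(-3-3)}\,u_{-3} + \frac{(x+3)(x-1)(x-3)}{(-1+3)(-1-1)(-1-3)}\,u_{-1} + \frac{(x+3)(x+1)(x-3)}{(1+3)(1+1)(1-3)}\,u_1 + \frac{(x+3)(x+1)(x-1)}{(3+3)(3+1)(3-1)}\,u_3$$

$$= -\tfrac{1}{48}(x+1)(x-1)(x-3)\,u_{-3} + \tfrac{1}{16}(x+3)(x-1)(x-3)\,u_{-1} - \tfrac{1}{16}(x+3)(x+1)(x-3)\,u_1 + \tfrac{1}{48}(x+3)(x+1)(x-1)\,u_3$$

$$= -\tfrac{1}{48}(x^3 - 3x^2 - x + 3)\,u_{-3} + \tfrac{1}{16}(x^3 - x^2 - 9x + 9)\,u_{-1} - \tfrac{1}{16}(x^3 + x^2 - 9x - 9)\,u_1 + \tfrac{1}{48}(x^3 + 3x^2 - x - 3)\,u_3.$$

Then, since the odd powers of $x$ contribute nothing between the limits $-1$ and $1$,

$$\int_{-1}^{1} u_x\,dx = -\tfrac{1}{48}(-2 + 6)\,u_{-3} + \tfrac{1}{16}\left(-\tfrac{2}{3} + 18\right)u_{-1} - \tfrac{1}{16}\left(\tfrac{2}{3} - 18\right)u_1 + \tfrac{1}{48}(2 - 6)\,u_3$$

$$= -\tfrac{1}{12}\,u_{-3} + \tfrac{13}{12}\,u_{-1} + \tfrac{13}{12}\,u_1 - \tfrac{1}{12}\,u_3$$

$$= \tfrac{1}{12}\left[13(u_1 + u_{-1}) - (u_3 + u_{-3})\right].$$

This is what we wished to show.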
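The weights found by hand in Ex. 1 and Ex. 2 can be checked mechanically. What follows is only a minimal sketch in Python, not part of the original text; the helper names `poly_mul` and `lagrange_weights` are our own. It builds each Lagrange basis polynomial with exact rational arithmetic and integrates it term by term over the stated limits.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def lagrange_weights(nodes, a, b):
    """Exact weights w_k such that the integral of the Lagrange interpolant
    over [a, b] equals sum_k w_k * u(nodes[k])."""
    weights = []
    for k, xk in enumerate(nodes):
        basis = [Fraction(1)]
        for m, xm in enumerate(nodes):
            if m != k:
                # Multiply the basis polynomial by (x - xm) / (xk - xm).
                d = Fraction(xk - xm)
                basis = poly_mul(basis, [Fraction(-xm) / d, Fraction(1) / d])
        # Integrate c * x**p term by term between a and b.
        w = sum(c * (Fraction(b) ** (p + 1) - Fraction(a) ** (p + 1)) / (p + 1)
                for p, c in enumerate(basis))
        weights.append(w)
    return weights


# Ex. 1: arguments -1, 0, 1 and limits 0 to 1.
ex1 = lagrange_weights([-1, 0, 1], 0, 1)
# Ex. 2: arguments -3, -1, 1, 3 and limits -1 to 1.
ex2 = lagrange_weights([-3, -1, 1, 3], -1, 1)
print(ex1)
print(ex2)
```

The first list reproduces the coefficients $-\frac{1}{12}, \frac{8}{12}, \frac{5}{12}$ of $u_{-1}, u_0, u_1$, and the second the coefficients $-\frac{1}{12}, \frac{13}{12}, \frac{13}{12}, -\frac{1}{12}$ of $u_{-3}, u_{-1}, u_1, u_3$.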

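Ex. 3. Evaluate $\int_0^1 \dfrac{dx}{1+x}$ correct up to five decimal places by Euler-Maclaurin's formula.

(M.A. Punjab 1957)

Euler-Maclaurin's formula gives

$$\frac{1}{h}\int_{x_0}^{x_0+nh} f(x)\,dx = \sum_{j=0}^{n} f(x_j) - \frac{1}{2}[f(x_n) + f(x_0)] - \frac{h}{12}[f'(x_0+nh) - f'(x_0)] + \frac{h^3}{720}[f'''(x_0+nh) - f'''(x_0)] - \ldots$$

Let $x_0 = 0$, $x_0 + nh = 1$ and $h = 0.1$. Then $n = 10$.

By hypothesis, $f(x) = \dfrac{1}{1+x}$, which, on differentiation, gives:

$$f'(x) = -\frac{1}{(1+x)^2}, \qquad f'''(x) = -\frac{6}{(1+x)^4}.$$

Then

$$\frac{1}{h}\int_0^1 \frac{dx}{1+x} = \left(1 + \frac{1}{1.1} + \frac{1}{1.2} + \ldots + \frac{1}{1.9} + \frac{1}{2}\right) - \frac{1}{2}\left(\frac{1}{2} + 1\right) - \frac{0.1}{12}\left[-\frac{1}{2^2} + 1\right] + \frac{(0.1)^3}{720}\left[-\frac{6}{2^4} + \frac{6}{1^4}\right]$$

$$= 0.5 + 0.90909 + 0.83333 + \ldots + 0.25000 - \frac{1}{120} \times \frac{3}{4} + \frac{1}{120000} \times \frac{15}{16}$$

$$= 6.93773 - 0.00625 + 0.00001$$

$$= 6.93149.$$

Therefore, multiplying by $h = 0.1$, $\int_0^1 \dfrac{dx}{1+x} = 0.69315$ approximately.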



Remark. $\int_0^1 \dfrac{dx}{1+x} = \log_e 2 = 0.69315$ up to five decimal places.
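The arithmetic of Ex. 3 can be repeated by machine. The following is a minimal sketch in Python, offered only as a check on the hand computation; the function name `euler_maclaurin` and its argument names are hypothetical. It applies the formula exactly as used above, keeping the terms in $f'$ and $f'''$ and dropping everything beyond them.

```python
import math


def euler_maclaurin(f, d1, d3, x0, n, h):
    """Approximate the integral of f from x0 to x0 + n*h using the
    Euler-Maclaurin formula truncated after the third-derivative term.
    d1 and d3 are the first and third derivatives of f."""
    xs = [x0 + j * h for j in range(n + 1)]
    # Sum of all ordinates, less half of the two end ordinates.
    total = sum(f(x) for x in xs) - 0.5 * (f(xs[-1]) + f(xs[0]))
    # Correction terms in f' and f'''.
    total -= h / 12 * (d1(xs[-1]) - d1(xs[0]))
    total += h ** 3 / 720 * (d3(xs[-1]) - d3(xs[0]))
    # The formula gives (1/h) times the integral, so multiply by h.
    return h * total


approx = euler_maclaurin(
    f=lambda x: 1 / (1 + x),
    d1=lambda x: -1 / (1 + x) ** 2,
    d3=lambda x: -6 / (1 + x) ** 4,
    x0=0.0, n=10, h=0.1,
)
print(round(approx, 5), round(math.log(2), 5))
```

Both printed values agree to five decimal places, in line with the Remark.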


Exercises 

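1. Establish the quadrature formula:

$$\int_{x_0}^{x_0+nh} y\,dx = h\left[n\,y_0 + \frac{n^2}{2}\,\Delta y_0 + \left(\frac{n^3}{3} - \frac{n^2}{2}\right)\frac{\Delta^2 y_0}{2!} + \left(\frac{n^4}{4} - n^3 + n^2\right)\frac{\Delta^3 y_0}{3!} + \ldots \text{ to } (n+1) \text{ terms}\right]$$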


(Agra B.Sc. 66, 64; I. S. S. 67; Sardar Patel 

University B Sc. 68) 

2. Discuss in brief the (a) Trapezoidal rule (b) Simpson’s 
rules for numerical integration. 

(Lucknow 63, 65; Delhi 67, 66, 61; Sardar Patel 68; 

Vrk. University 68, 66; Agra B.Sc. 62) 

3. Assume $y = a + bx + cx^2$ and $y_0$, $y_1$ and $y_2$ are the values of $y$ corresponding to $x = a$, $a+h$, $a+2h$ respectively. Then

$$\int_a^{a+2h} y\,dx = \frac{h}{3}(y_0 + 4y_1 + y_2).$$

4. Show that 1.62 is an approximate value of

$$\int_1^5 \frac{dx}{x}$$

by Simpson's 1/3rd rule where $h = 1$.

5. Show that the value of $\pi$ from

$$\int_0^1 \frac{dx}{1+x^2}$$

by Simpson's one third rule, decomposing the interval into four equal parts, is 3.1416.

6. Apply Simpson's one third rule to find

8 dx 


i 


(1 + *^ * 

(Agra B. Sc. 68; Lucknow B.Sc. 65) 

7. Compute the approximate value of $\int_{-3}^{3} x^4\,dx$ by using (a) Trapezoidal rule (b) Simpson's one third rule, by dividing this range into six intervals. Compare these values with the exact value. (Agra B.Sc. 64; Delhi B.Sc. 67; Bangalore B.Sc. 68)

Ans. (a) 115, (b) 98; actual value = 97.2.
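The two rules of Exercise 2 are short enough to state as code. The sketch below (Python, with the hypothetical function names `trapezoidal` and `simpson_one_third`) is one possible way to check the answers quoted for Exercise 7; it assumes equally spaced ordinates and, for Simpson's rule, an even number of intervals.

```python
def trapezoidal(ys, h):
    """Trapezoidal rule for equally spaced ordinates ys with spacing h."""
    return h * (0.5 * (ys[0] + ys[-1]) + sum(ys[1:-1]))


def simpson_one_third(ys, h):
    """Simpson's one third rule; needs an odd number of ordinates."""
    if len(ys) < 3 or len(ys) % 2 == 0:
        raise ValueError("Simpson's one third rule needs an even number of intervals")
    odd = sum(ys[1:-1:2])    # ordinates y1, y3, y5, ...
    even = sum(ys[2:-1:2])   # ordinates y2, y4, ...
    return h * (ys[0] + ys[-1] + 4 * odd + 2 * even) / 3


# Exercise 7: x**4 on [-3, 3] divided into six intervals (h = 1).
ys = [x ** 4 for x in range(-3, 4)]
trap = trapezoidal(ys, 1)
simp = simpson_one_third(ys, 1)
exact = 2 * 3 ** 5 / 5
print(trap, simp, exact)
```

The trapezoidal value overshoots the exact value by far more than Simpson's, as is to be expected for a curve that is convex over the whole range.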





8. A rocket is launched from Cape Kennedy. Its acceleration is registered during the first 60 seconds and is given as follows:

Time t (sec)      :   0      10     20     30     40     50     60
Acc. f (m/sec^2)  : 30.00  31.63  33.44  35.47  37.75  40.33  43.25

Find the velocity of the rocket.

Ans. 3087 m/sec.

9. Evaluate

$$\int_{0.2}^{1.4} (\sin x - \log_e x + e^x)\,dx$$

by (a) Trapezoidal rule, (b) Simpson's 1/3rd rule, (c) Simpson's 3/8th rule and (d) Weddle's rule. Compare these values with the actual value of the integral.

[Delhi M.A. (Statistics) 68; Agra M.Sc. (Stat.) 64, 65; Poona M.A. 57]

Ans. (a) 4.05617, (b) 4.05107, (c) 4.0516 and (d) 4.05098. Actual value is 4.05095.

10. Apply an approximate integration formula to find the value of

$$\int_0^6 u_x\,dx$$

using the following data:

x     :   0      1      2      3      4      5      6
u_x   : 0.146  0.161  0.176  0.190  0.204  0.217  0.230

(Agra B.Sc. 67)

11. Apply Simpson's one third rule to compute the value of the integral

$$\int_4^{5.2} \log_e x\,dx$$

from the following data:

x         :  4.0     4.2     4.4     4.6     4.8     5.0     5.2
log_e x   : 1.3863  1.4351  1.4816  1.5260  1.5686  1.6094  1.6486

Compare it with the exact value of the integral.

(Delhi 1968)


12. Apply Simpson's one third rule to find an approximate value of $\log_e 2$ from

J‘T and l! ’ 

taking 5 ordinates. (Agra B.Sc. 62; 68)

13. Apply Simpson's rule to show that

$$\int_0^1 \frac{\log(1+x^2)}{1+x^2}\,dx = 0.1730.$$

[Mysore B.Sc. 64; Sardar Patel University B.Sc. 67]

14. Apply the approximate formula

$$\int_0^1 u_x\,dx = \frac{1}{12}(5u_1 + 8u_0 - u_{-1})$$

to find the approximate mileage travelled between 12.30 hours and 12.40 hours from the following data:

Time            : 11.50  12.00  12.10  12.20  12.30
Speed (m.p.h.)  : 24.2   35.0   41.3   42.8   39.2

15. Prove that

$$\int_{-3}^{3} u_x\,dx = \frac{3}{10}\left[11(u_{-2} + u_2) - 14(u_{-1} + u_1) + 26\,u_0\right]$$

(Banaras M.Sc. 68)

16. Let $u_x = a + bx + cx^2 + dx^3 + ex^4 + fx^5$.

Then $\int_{-3}^{3} u_x\,dx = 2.2\,u_0 + 1.62\,(u_{-2} + u_2) + 0.28\,(u_{-3} + u_3)$

or $\int_0^6 u_x\,dx = 0.28\,(u_0 + u_6) + 1.62\,(u_1 + u_5) + 2.2\,u_3$.

This is called Hardy's formula.

Apply this formula to obtain the value of

m il* 

(1 — x)-*! 2 dx 


. 


17. Sum the series:

(a) $\dfrac{1}{51^2} + \dfrac{1}{53^2} + \dfrac{1}{55^2} + \ldots + \dfrac{1}{99^2}$

[Ans. 0.004998]

(b) $\dfrac{1}{100} + \dfrac{1}{101} + \dfrac{1}{102} + \dfrac{1}{103} + \dfrac{1}{104}$

[Ans. 0.0490291]

(c) $\dfrac{1}{201^2} + \dfrac{1}{203^2} + \ldots + \dfrac{1}{299^2}$

(Delhi M.A. 1950; Punjab 50)



19

ANALYSIS OF VARIANCE

Further applications of F distribution

19.1. Analysis of variance is 'the separation of the variance ascribable to one group of causes from the variance ascribable to other groups'. It is a procedure by which the variation embodied in the data of the sample may be resolved into component variations due to independent factors. Each of the components yields an estimate of the population variance, and these estimates are tested for homogeneity by means of the F table.

Consider a random sample of $N$ values of a normally distributed variable $x$. It is frequently possible to arrange these in classes according to a certain factor or criterion. If the variable is the crop yield of a variety of cereal, the classes may correspond to different manurial treatments. The classes are $i = 1, 2, 3, \ldots, h$. The frequency of the $i$th class is taken as $n_i$ and $x_{ij}$ denotes the value of the $j$th member in the $i$th class. Let $\bar{x}_i$ denote the mean in the $i$th class and $\bar{x}$ the grand mean.

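Thus

$$N\bar{x} = \sum_i \sum_j x_{ij}, \qquad \sum_i \sum_j (x_{ij} - \bar{x}) = 0,$$

$$n_i \bar{x}_i = \sum_j x_{ij}, \qquad \sum_j (x_{ij} - \bar{x}_i) = 0.$$

Observe

$$\sum_j (x_{ij} - \bar{x})^2 = \sum_j (x_{ij} - \bar{x}_i + \bar{x}_i - \bar{x})^2 = \sum_j (x_{ij} - \bar{x}_i)^2 + n_i(\bar{x}_i - \bar{x})^2,$$

the product term $2(\bar{x}_i - \bar{x})\sum_j (x_{ij} - \bar{x}_i)$ being zero, or

$$\sum_i \sum_j (x_{ij} - \bar{x})^2 = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2 + \sum_i n_i(\bar{x}_i - \bar{x})^2 \qquad \ldots(1)$$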

This formula holds, whether the population is normal or not. 
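Since (1) is a purely algebraic identity, it can be confirmed on any set of numbers whatever. The following minimal Python sketch does so on made-up data with unequal class sizes; the function name `partition_sums` and the figures in `classes` are hypothetical and serve only as an illustration.

```python
def partition_sums(classes):
    """Return (total, within, between) sums of squares for a list of classes,
    computed directly from the definitions used in formula (1)."""
    n_total = sum(len(c) for c in classes)
    grand_mean = sum(sum(c) for c in classes) / n_total
    class_means = [sum(c) / len(c) for c in classes]
    total = sum((x - grand_mean) ** 2 for c in classes for x in c)
    within = sum((x - m) ** 2 for c, m in zip(classes, class_means) for x in c)
    between = sum(len(c) * (m - grand_mean) ** 2
                  for c, m in zip(classes, class_means))
    return total, within, between


# Arbitrary, deliberately non-normal figures with unequal class sizes.
classes = [[1, 4, 4, 10], [2, 2], [0, 7, 30]]
total, within, between = partition_sums(classes)
print(total, within + between)
```

The two printed figures agree up to rounding error, whatever numbers are placed in `classes`.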

19.2. One-way classification. Suppose that the factor of classification has no effect upon the value of the variate. Then if the population is divided into classes according to this factor, the different classes will have the same statistical properties, that is, $\mu$ and $\sigma^2$ will be the same as in the population. From the various sums in (1) we can obtain three unbiased estimates of $\sigma^2$.

Recall

$$E\left[\sum_i \sum_j (x_{ij} - \bar{x})^2\right] = (N-1)\,\sigma^2,$$

$$E\left[\sum_j (x_{ij} - \bar{x}_i)^2\right] = (n_i - 1)\,\sigma^2 \;\Rightarrow\; E\left[\sum_i \sum_j (x_{ij} - \bar{x}_i)^2\right] = \sum_i (n_i - 1)\,\sigma^2 = (N-h)\,\sigma^2,$$

since $\sum_i n_i = N$, and $i = 1, 2, 3, \ldots, h$.

$$E\left[\sum_i \sum_j (x_{ij} - \bar{x})^2\right] = E\left[\sum_i \sum_j (x_{ij} - \bar{x}_i)^2\right] + E\left[\sum_i n_i(\bar{x}_i - \bar{x})^2\right]$$

$$\Rightarrow\; E\left[\sum_i n_i(\bar{x}_i - \bar{x})^2\right] = (N-1)\,\sigma^2 - (N-h)\,\sigma^2 = (h-1)\,\sigma^2.$$

The degrees of freedom of the various sums in (1) are additive, since $(N-1)\,\sigma^2 = (N-h)\,\sigma^2 + (h-1)\,\sigma^2$.

The above results are usually tabulated as follows:

Sources of variation     D.F.     Sum of squares                               Mean square

Between class means      h - 1    $\sum_i n_i(\bar{x}_i - \bar{x})^2$          $\sum_i n_i(\bar{x}_i - \bar{x})^2/(h-1)$

Within classes           N - h    $\sum_i \sum_j (x_{ij} - \bar{x}_i)^2$       $\sum_i \sum_j (x_{ij} - \bar{x}_i)^2/(N-h)$

Total                    N - 1    $\sum_i \sum_j (x_{ij} - \bar{x})^2$

Note that the items are additive in the columns headed 'D.F.' and 'sum of squares', and not in the last column which gives the estimates of $\sigma^2$.

In the case of a homogeneous normal population,

$$\sum_i \sum_j (x_{ij} - \bar{x})^2/\sigma^2 \sim \chi^2_{N-1},$$

$$\sum_j (x_{ij} - \bar{x}_i)^2/\sigma^2 \sim \chi^2_{n_i - 1} \;\Rightarrow\; \sum_i \sum_j (x_{ij} - \bar{x}_i)^2/\sigma^2 \sim \sum_i \chi^2_{n_i - 1} = \chi^2_{N-h},$$

because of the additive property of the $\chi^2$ distribution.

Similarly $\sum_i n_i(\bar{x}_i - \bar{x})^2/\sigma^2 \sim \chi^2_{h-1}$.

Thus $\chi^2_{N-1} = \chi^2_{N-h} + \chi^2_{h-1}$, when the various sums in (1) are divided by $\sigma^2$, the variance in the population.

In order to test the homogeneity of the estimates of $\sigma^2$ by means of the variance ratio and the F table, it is necessary to assume that the population is normal. The practice is to compare the estimate 'between class means' with that obtained 'within classes'. The sum $\sum_i n_i(\bar{x}_i - \bar{x})^2$ represents the variation due to the factor of classification, and $\sum_i \sum_j (x_{ij} - \bar{x}_i)^2$ is the residual variation after the former has been removed. If the estimate obtained between classes is significantly greater than that within classes, we are justified in concluding that the factor of classification exercises an influence on the value of the variable and the population is heterogeneous. If the estimates of $\sigma^2$ are not significantly different, the test provides no evidence against the hypothesis of a homogeneous population. Let us note that the two estimates of $\sigma^2$ provided by $\dfrac{1}{h-1}\sum_i n_i(\bar{x}_i - \bar{x})^2$ and $\dfrac{1}{N-h}\sum_i \sum_j (x_{ij} - \bar{x}_i)^2$ are independent, since the mean of a random sample from a normal population is distributed independently of its variance.

19.3. Two-way classification. Let the $N$ values $x_{ij}$ be classified according to the criteria A and B. Suppose A determines $h$ different classes and B determines $k$ different groups. Now the $hk$ [$N = hk$] values of the variable are such that in each of the $h$ classes there is one value from each group, and in each of the $k$ groups one value from each class. The classification has $h$ columns and $k$ rows according to the criteria A and B. $x_{ij}$ denotes the value in the $i$th class and $j$th group. Let $\bar{x}$ denote the general mean, $\bar{x}_i$ the mean in the $i$th class, and $\bar{x}_j$ the mean in the $j$th group. We resolve the sum $\sum_i \sum_j (x_{ij} - \bar{x}_i)^2$ with respect to the means $\bar{x}_j$ of rows.

Let $X_{ij} = x_{ij} - \bar{x}_i$, that is, each value of the original variable is diminished by the mean of the column in which it lies.

Then $\sum_i \sum_j X_{ij} = \sum_i \sum_j (x_{ij} - \bar{x}_i) = 0 \;\Rightarrow\; \bar{X} = 0$.

Similarly, $\bar{X}_j = \dfrac{1}{h}\sum_{i=1}^{h} X_{ij} = \dfrac{1}{h}\sum_{i=1}^{h} (x_{ij} - \bar{x}_i) = \bar{x}_j - \bar{x}$.

Now

$$\sum_i \sum_j (X_{ij} - \bar{X})^2 = \sum_i \sum_j (X_{ij} - \bar{X}_j + \bar{X}_j - \bar{X})^2 = \sum_i \sum_j (X_{ij} - \bar{X}_j)^2 + \sum_j h(\bar{X}_j - \bar{X})^2,$$

or its equivalent

$$\sum_i \sum_j (x_{ij} - \bar{x}_i)^2 = \sum_i \sum_j (x_{ij} - \bar{x}_i - \bar{x}_j + \bar{x})^2 + \sum_j h(\bar{x}_j - \bar{x})^2.$$

Substituting this value in (1) and remembering $n_i = k$ for every $i$, we have

$$\sum_i \sum_j (x_{ij} - \bar{x})^2 = \sum_i \sum_j (x_{ij} - \bar{x}_i - \bar{x}_j + \bar{x})^2 + \sum_i k(\bar{x}_i - \bar{x})^2 + \sum_j h(\bar{x}_j - \bar{x})^2 \qquad \ldots(2)$$

Taking expected values on both sides, we get

$$(hk-1)\,\sigma^2 = (h-1)(k-1)\,\sigma^2 + (h-1)\,\sigma^2 + (k-1)\,\sigma^2$$

[Since the two sides must give identical values]

Dividing (2) by $\sigma^2$, we get

$$\chi^2_{hk-1} = \chi^2_{(h-1)(k-1)} + \chi^2_{h-1} + \chi^2_{k-1}.$$

Uncontrolled variation is represented by $\sum_i \sum_j (x_{ij} - \bar{x}_i - \bar{x}_j + \bar{x})^2$. It is termed 'error'. We compare the estimates of variance with that of error. The population must be assumed normal for the application of the F table. If either estimate is significantly different from that obtained from error, the hypothesis of homogeneity is discredited. The significance of the difference between any two classes or two groups may be tested by the t-table. The above results are usually tabulated as follows:



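Source of variation   D.F.          Sum of squares                                                  Mean square

Between classes       h - 1         $\sum_i k(\bar{x}_i - \bar{x})^2$                               The quotient of 'sum of
Between groups        k - 1         $\sum_j h(\bar{x}_j - \bar{x})^2$                               squares' by D.F. in each
Error                 (h-1)(k-1)    $\sum_i \sum_j (x_{ij} - \bar{x}_i - \bar{x}_j + \bar{x})^2$    case.

Total                 hk - 1        $\sum_i \sum_j (x_{ij} - \bar{x})^2$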


For ease of calculation we employ the following trick.

We have

$$\sum_s (x_s - \bar{x})^2 = \sum_s x_s^2 - N\bar{x}^2 = \sum_s x_s^2 - (N\bar{x})^2/N = \sum_s x_s^2 - T^2/N \qquad \ldots(3)$$

where $T$ = sum of the numbers.

$$\therefore\; \sum_i \sum_j (x_{ij} - \bar{x})^2 = \sum_i \sum_j x_{ij}^2 - T^2/N, \quad \text{where } T = \sum_i \sum_j x_{ij}. \qquad \ldots(4)$$

Applying (3) within the $i$th class,

$$\sum_j (x_{ij} - \bar{x}_i)^2 = \sum_j x_{ij}^2 - T_i^2/n_i,$$

where $T_i = \sum_j x_{ij}$ = sum of the numbers in the $i$th column.

Then

$$\sum_i \sum_j (x_{ij} - \bar{x}_i)^2 = \sum_i \sum_j x_{ij}^2 - \sum_i T_i^2/n_i \qquad \ldots(5)$$

From (4) and (5), by means of (1),

$$\sum_i n_i(\bar{x}_i - \bar{x})^2 = \sum_i \frac{T_i^2}{n_i} - \frac{T^2}{N} \qquad \ldots(6)$$

or alternatively,

$$\sum_i n_i(\bar{x}_i - \bar{x})^2 = \sum_i n_i \bar{x}_i^2 - \Big(\sum_i n_i \bar{x}_i\Big)^2/N = \sum_i \frac{T_i^2}{n_i} - \frac{T^2}{N}.$$

Since the deviations from the means are independent of the choice of origin, the above results will remain unaltered by a change of origin. In other words, if all the values $x_{ij}$ are decreased (or increased) by the same constant, the values obtained for the three sums of squares are unchanged.

Ex. 1. The following table gives the results of experiments on four varieties of a crop in 5 blocks of plots:

               Varieties
           A     B     C     D

          32    34    31    29
          34    33    34    26
Block     33    36    35    30
          35    37    32    28
          37    35    36    29

Prepare the table of analysis of variance to test the significance of the difference between the yields of the four varieties.

[M.A. Punjab 46]



The classification is according to variety. The number of classes is $h = 4$, and the number of items in each class is $n_i = 5$. Consequently $N = 4 \times 5 = 20$. The arithmetic is simplified by shifting the origin to (say) $x = 32$. Diminishing all the yields by 32 we may rewrite the table:

 0     2    -1    -3
 2     1     2    -6
 1     4     3    -2
 3     5     0    -4
 5     3     4    -3

from which we have

$T_i = 11, 15, 8, -18$; $T = 16$   [$T = 11 + 15 + 8 - 18 = 16$, $T_1 = 0 + 2 + 1 + 3 + 5 = 11$ etc.]

$\bar{x}_i = \dfrac{11}{5}, \dfrac{15}{5}, \dfrac{8}{5}, -\dfrac{18}{5} = 2.2,\ 3,\ 1.6,\ -3.6$

$\sum_j x_{ij}^2 = 39, 55, 30, 74$   [$\sum_j x_{1j}^2 = 0^2 + 2^2 + 1^2 + 3^2 + 5^2 = 39$ etc.]

$\sum_i \sum_j x_{ij}^2 = 198$, $\quad \dfrac{T^2}{N} = \dfrac{16^2}{20} = 12.8$,

$\sum_i \dfrac{T_i^2}{n_i} = \dfrac{1}{5}\left[11^2 + 15^2 + 8^2 + (-18)^2\right] = 146.8$

The three sums of squares are therefore

$$\sum_i \sum_j (x_{ij} - \bar{x})^2 = \sum_i \sum_j x_{ij}^2 - T^2/N = 198 - 12.8 = 185.2$$

$$\sum_i \sum_j (x_{ij} - \bar{x}_i)^2 = \sum_i \sum_j x_{ij}^2 - \sum_i T_i^2/n_i = 198 - 146.8 = 51.2$$

$$\sum_i n_i(\bar{x}_i - \bar{x})^2 = \sum_i T_i^2/n_i - T^2/N = 146.8 - 12.8 = 134.$$

i 

In tabular form:

Source of variation    D.F.           Sum of squares    Mean square          F

Between varieties      4 - 1 = 3      134               134/3 = 44.67        44.67/3.2 = 13.96
Within varieties       20 - 4 = 16    51.2              51.2/16 = 3.2

For $\nu_1 = 3$ and $\nu_2 = 16$, $F_{0.05} = 3.24$. Thus the value of F is highly significant since 13.96 is greater than 3.24. Since the estimates of variance between varieties and within varieties are significantly different, the experiment as a whole does indicate significant variation in the yields of varieties.
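The computation of Ex. 1 by the short-cut formulae (4), (5) and (6) can be sketched in a few lines of Python. This is an illustrative check only; the function name `one_way_anova` and the dictionary keys are our own, and the critical value 3.24 must still be read from the F table. Running it on the original and on the shifted yields also illustrates that a change of origin leaves the sums of squares unaltered.

```python
def one_way_anova(classes):
    """One-way analysis of variance by the short-cut formulae (4), (5), (6).
    `classes` is a list of lists, one list of observations per class."""
    n_total = sum(len(c) for c in classes)
    h = len(classes)
    grand_total = sum(sum(c) for c in classes)            # T
    crude = sum(x * x for c in classes for x in c)         # sum of x_ij^2
    class_term = sum(sum(c) ** 2 / len(c) for c in classes)  # sum of T_i^2 / n_i
    correction = grand_total ** 2 / n_total               # T^2 / N
    between = class_term - correction
    within = crude - class_term
    ms_between = between / (h - 1)
    ms_within = within / (n_total - h)
    return {
        "total": crude - correction,
        "between": between,
        "within": within,
        "df": (h - 1, n_total - h),
        "F": ms_between / ms_within,
    }


# Yields of Ex. 1, one list per variety (the columns of the table).
yields = {
    "A": [32, 34, 33, 35, 37],
    "B": [34, 33, 36, 37, 35],
    "C": [31, 34, 35, 32, 36],
    "D": [29, 26, 30, 28, 29],
}
raw = one_way_anova(list(yields.values()))
shifted = one_way_anova([[x - 32 for x in col] for col in yields.values()])
print(raw)
print(shifted)
```

Both calls give the sums of squares 185.2, 134 and 51.2 (up to rounding error) and a variance ratio of about 13.96 on 3 and 16 degrees of freedom.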

Ex. 2. Four varieties of potato are planted, each on five plots of ground of the same size and type, and each variety is treated with five different fertilizers. The yields in tons are as follows:

                     Fertilizer
Variety      1      2      3      4      5

   1        1.9    2.2    2.6    1.8    2.1
   2        2.5    1.9    2.3    2.6    2.2
   3        1.7    1.9    2.2    2.0    2.1
   4        2.1    1.8    2.5    2.3    2.4

Perform an analysis of variance and show whether there is any significant difference between the yields of different varieties due to different fertilizers. [B.Sc. Agra 66]

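Here the number of classes (fertilizers) is $h = 5$, and the number of groups (varieties) is $k = 4$. Consequently $N = 5 \times 4 = 20$. The arithmetic is simplified by shifting the origin to $x = 2$ and taking the unit of measurement as 1/10 ton. Thus $u_{ij} = \dfrac{x_{ij} - 2}{0.1}$.

Now we may rewrite the table:

-1     2     6    -2     1
 5    -1     3     6     2
-3    -1     2     0     1
 1    -2     5     3     4

from which we have

$T_i = 2, -2, 16, 7, 8$; $T = 31$   [$T_1 = -1 + 5 - 3 + 1 = 2$ etc.]

$\bar{u}_i = \dfrac{2}{4}, -\dfrac{2}{4}, \dfrac{16}{4}, \dfrac{7}{4}, \dfrac{8}{4} = 0.5,\ -0.5,\ 4,\ 1.75,\ 2$

Hence $\sum_j u_{ij}^2 = 36, 10, 74, 49, 22$   [$\sum_j u_{1j}^2 = (-1)^2 + (5)^2 + (-3)^2 + (1)^2 = 36$ etc.]

and $\sum_i \sum_j u_{ij}^2 = 191$, $\quad \dfrac{T^2}{N} = \dfrac{31^2}{20} = 48.05$,

$\sum_i \dfrac{T_i^2}{4} = \dfrac{1}{4}\left(2^2 + (-2)^2 + 16^2 + 7^2 + 8^2\right) = 94.25$

$\sum_i \dfrac{T_i^2}{4} - \dfrac{T^2}{20} = 94.25 - 48.05 = 46.20$

$T_j = 6, 15, -1, 11$   [Here the elements in each row are added]

$\bar{u}_j = \dfrac{6}{5}, \dfrac{15}{5}, -\dfrac{1}{5}, \dfrac{11}{5} = 1.2,\ 3,\ -0.2,\ 2.2$

$\sum_j \dfrac{T_j^2}{5} = \dfrac{1}{5}\left(6^2 + 15^2 + (-1)^2 + 11^2\right) = 76.6$

$\sum_j \dfrac{T_j^2}{5} - \dfrac{T^2}{20} = 76.6 - 48.05 = 28.55$

The sums of squares are therefore:

$$\sum_i k(\bar{u}_i - \bar{u})^2 = \sum_i \frac{T_i^2}{4} - \frac{T^2}{20} = 46.20$$

$$\sum_j h(\bar{u}_j - \bar{u})^2 = \sum_j \frac{T_j^2}{5} - \frac{T^2}{20} = 28.55$$

$$\sum_i \sum_j (u_{ij} - \bar{u})^2 = \sum_i \sum_j u_{ij}^2 - \frac{T^2}{N} = 191 - 48.05 = 142.95$$

$$\sum_i \sum_j (u_{ij} - \bar{u}_i - \bar{u}_j + \bar{u})^2 = 142.95 - 46.20 - 28.55 = 68.20$$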

The tabulation of results is:

Source of variation      D.F.         Sum of squares    Mean square          F

Between fertilizers      5 - 1 = 4    46.20             46.20/4 = 11.55      11.55/5.68 = 2.03
Between varieties        4 - 1 = 3    28.55             28.55/3 = 9.52       9.52/5.68 = 1.67
Error                    12           68.20             68.20/12 = 5.68

Total                    19           142.95

For $\nu_1 = 4$ and $\nu_2 = 12$, $F_{0.05} = 3.26$. The calculated F, i.e. 2.03, is less than 3.26; the difference between fertilizers is not significant. For $\nu_1 = 3$ and $\nu_2 = 12$, $F_{0.05} = 3.49$. This is much greater than 1.67. Hence the difference between varieties is also not significant.
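The two-way computation of Ex. 2 follows the same pattern and can be sketched as below. The Python is illustrative only; the function name `two_way_anova` and its dictionary keys are hypothetical. The table is entered in the coded units $u_{ij}$, with one row per variety (group) and one column per fertilizer (class).

```python
def two_way_anova(table):
    """Two-way analysis of variance with one observation per cell.
    table[j][i] is the value in the j-th group (row) and i-th class (column)."""
    k = len(table)          # number of groups (rows)
    h = len(table[0])       # number of classes (columns)
    n_total = h * k
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / n_total                # T^2 / N
    crude = sum(x * x for row in table for x in row)
    col_totals = [sum(row[i] for row in table) for i in range(h)]   # T_i
    row_totals = [sum(row) for row in table]                        # T_j
    between_classes = sum(t * t for t in col_totals) / k - correction
    between_groups = sum(t * t for t in row_totals) / h - correction
    total = crude - correction
    error = total - between_classes - between_groups
    ms_error = error / ((h - 1) * (k - 1))
    return {
        "between_classes": between_classes,
        "between_groups": between_groups,
        "error": error,
        "total": total,
        "F_classes": between_classes / (h - 1) / ms_error,
        "F_groups": between_groups / (k - 1) / ms_error,
    }


# Coded yields u = (x - 2) / 0.1 of Ex. 2: rows are varieties, columns fertilizers.
coded = [
    [-1, 2, 6, -2, 1],
    [5, -1, 3, 6, 2],
    [-3, -1, 2, 0, 1],
    [1, -2, 5, 3, 4],
]
result = two_way_anova(coded)
print(result)
```

The variance ratios come out at about 2.03 for fertilizers and 1.67 for varieties, as in the table; they are unaffected by the change of unit, since numerator and denominator are scaled alike.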

Exercises 

1. Give two illustrations where the technique of analysis of variance is useful. In one way classification of data in $k$ groups, show that the total sum of squares can be expressed as made up of the sum of squares 'between the groups' and 'within groups'. Prepare the analysis of variance table to test the homogeneity of the means of the $k$ groups.

Show that when $k = 2$, the 'between groups' sum of squares reduces to

$$\frac{n_1 n_2}{n_1 + n_2}\,(\bar{x}_1 - \bar{x}_2)^2,$$

where $n_1$ and $n_2$ are the sizes of the two groups and $\bar{x}_1$ and $\bar{x}_2$ are the group means.

Hence, or otherwise, show that the t-test for the significance of the difference between two sample means is a special case of the F test for $k$ samples.

2. State the mathematical model used in analysis of variance in a two way classification. Explain the hypothesis to be tested. Discuss the advantages of this method over one way classification, if any. [B.A. Bombay '69]

3. $x_{ij}$ ($i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, n$) are independent variates with $E(x_{ij}) = \mu$ and $\text{Var}(x_{ij}) = \sigma^2$ for all $i$, $j$.

Find the expectations of

$$n \sum_{i=1}^{k} (\bar{x}_i - \bar{x})^2 \quad \text{and} \quad \sum_{i=1}^{k} \sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2,$$

where $\bar{x}_i = \dfrac{1}{n}\sum_j x_{ij}$ and $\bar{x} = \dfrac{1}{nk}\sum_i \sum_j x_{ij}$.

Show how your results lead to a criterion for testing the equality of means of normal populations having the same variance and explain how the test is carried out. [B.Sc. Nagpur '68]

4. Explain what you understand by 'Analysis of variance'. State the assumptions.

Three varieties of coal were analysed by four chemists and the ash-content in the varieties was found to be as under:

                 Chemists
Varieties      1      2

    A          8      5
    B          7      6
    C          3      6

Do the varieties differ significantly in ash content?

[B.Sc. Krk. '69]

5. A test was given to five students taken at random from the fifth class of three schools of a town. The individual scores are:

School I      9   7   6   5   8
School II     7   4   5   4   5
School III    6   5   6   7   6

Carry out the analysis of variance, and state your conclusions. [B.Sc. Lucknow '60]

6. Three processes A, B and C are tested to see whether their outputs are equivalent. The following observations of outputs are made:

A : 10, 12, 13, 11, 10, 14, 15, 13
B : 9, 11, 10, 12, 13
C : 11, 10, 15, 14, 12, 13

Carry out the analysis of variance and state your conclusion.

[B.Sc. Bom. '69]

7. Show how to split up the total sum of squares. Three varieties of wheat, A, B and C, were tested for their yield. Each of five blocks was divided into three plots and the plots of each block were assigned at random to the three varieties. Perform the analysis of variance, given the following:

(1) Total yield from all plots : 47 maunds

(2) Crude sum of squares : 183

(3) Block totals : 13, 5, 4, 14, 11

(4) Variety totals : 16, 13, 18. [B.Sc. Krk. '69]

8. In comparing the effects of 3 fertilizers $A_1$, $A_2$, $A_3$ and 4 varieties $B_1$, $B_2$, $B_3$, $B_4$ of corn, the following data were obtained for yields:

          B_1    B_2    B_3    B_4

A_1       15     20     22     20
A_2       20     30     32     28
A_3       20     35     38     32

Are the fertilizers equally effective? Is there no difference between varieties? [B.Sc. Bom. '69]

9. The following table gives the yields of six varieties of a crop in an experiment arranged in six randomised blocks with six varieties. Test for block and variety differences.

                      Varieties
Blocks      A      B      C      D      E      F

  I        194    170    223    214    180    182
  II       208    196    226    208    223    206
  III      186    183    222    216    167    203
  IV       221    160    208    216    211    196
  V        203    213    202    217    201    228
  VI       201    182    207    181    171    214

[B.Sc. Sar. Patel '69]





